Course objectives



In order to reach the more interesting and useful ideas, we shall adopt a fairly 
brutal approach to some early material. Lengthy proofs will sometimes be left 
out, though full versions will be made available. By the end of the course, you 
should have a good understanding of normed vector spaces, Hilbert and Banach
spaces, fixed point theorems and examples of function spaces. These ideas will be 
illustrated with applications to differential equations. 

Books 

You do not need to buy a book for this course, but the following may be useful for 
background reading. If you do buy something, the starred books are recommended.

[1] Functional Analysis, W. Rudin, McGraw-Hill (1973). This book is thorough, 
sophisticated and demanding. 

[2] Functional Analysis, F. Riesz and B. Sz.-Nagy, Dover (1990). This is a classic 
text, also much more sophisticated than the course. 

[3]* Foundations of Modern Analysis, A. Friedman, Dover (1982). Cheap and 
cheerful, includes a useful few sections on background. 

[4]* Essential Results of Functional Analysis, R.J. Zimmer, University of Chicago
Press (1990). Lots of good problems and a useful chapter on background.

[5]* Functional Analysis in Modern Applied Mathematics, R.F. Curtain and A.J.
Pritchard, Academic Press (1977). This book is closest to the course.

[6]* Linear Analysis, B. Bollobás, Cambridge University Press (1995). This book is
excellent but makes heavy demands on the reader.

CHAPTER 1



Normed Linear Spaces 

A linear space is simply an abstract version of the familiar vector spaces $\mathbb{R}$, $\mathbb{R}^2$,
$\mathbb{R}^3$ and so on. Recall that vector spaces have certain algebraic properties: vectors
may be added, multiplied by scalars, and vector spaces have bases and subspaces.
Linear maps between vector spaces may be described in terms of matrices. Using the
Euclidean norm or distance, vector spaces have other analytic properties (though
you may not have called them that): for example, certain functions from $\mathbb{R}$ to $\mathbb{R}$
are continuous, differentiable, Riemann integrable and so on.

We need to make three steps of generalization. 
Bases: The first is familiar: instead of, for example, $\mathbb{R}^3$, we shall sometimes want
to talk about an abstract three-dimensional vector space $V$ over the field $\mathbb{R}$. This
distinction amounts to having a specific basis $\{e_1, e_2, e_3\}$ in mind, in which case
every element of $V$ corresponds to a triple $(a, b, c) = a e_1 + b e_2 + c e_3$ of reals, or
choosing not to think of a specific basis, in which case the elements of $V$ are just
abstract vectors $v$. In the abstract language we talk about linear maps or operators
between vector spaces; after choosing a basis linear maps become matrices, though
in an infinite-dimensional setting it is rarely useful to think in terms of matrices.
Ground fields: The second is fairly trivial and is also familiar: the ground field
can be any field. We shall only be interested in $\mathbb{R}$ (real vector spaces) and $\mathbb{C}$
(complex vector spaces). Notice that $\mathbb{C}$ is itself a two-dimensional vector space
over $\mathbb{R}$ with additional structure (multiplication). Choosing the basis $\{1, i\}$ for $\mathbb{C}$
over $\mathbb{R}$ we may identify $z \in \mathbb{C}$ with the vector $(\Re(z), \Im(z)) \in \mathbb{R}^2$.
Dimension: In linear algebra courses, you deal with finite-dimensional vector
spaces. Such spaces (over a fixed ground field) are determined up to isomorphism
by their dimension. We shall be mainly looking at linear spaces that are
not finite-dimensional, and several new features appear. All of these features may
be summed up in one line: the algebra of infinite-dimensional linear spaces is
intimately connected to the topology. For example, linear maps between $\mathbb{R}^2$ and $\mathbb{R}^2$
are automatically continuous. For infinite-dimensional spaces, some linear maps
are not continuous.

1. Linear (vector) spaces 

Definition 1.1. A linear space over a field $k$ is a set $V$ equipped with maps
$\oplus : V \times V \to V$ and $\cdot : k \times V \to V$ with the properties

(1) $x \oplus y = y \oplus x$ for all $x, y \in V$ (addition is commutative);

(2) $(x \oplus y) \oplus z = x \oplus (y \oplus z)$ for all $x, y, z \in V$ (addition is associative);

(3) there is an element $0 \in V$ such that $x \oplus 0 = 0 \oplus x = x$ for all $x \in V$ (a zero
element);

(4) for each $x \in V$ there is a unique element $-x \in V$ with $x \oplus (-x) = 0$ (additive
inverses);

(notice that $(V, \oplus)$ therefore forms an abelian group)

(5) $\alpha \cdot (\beta \cdot x) = (\alpha\beta) \cdot x$ for all $\alpha, \beta \in k$ and $x \in V$;

(6) $(\alpha + \beta) \cdot x = \alpha \cdot x \oplus \beta \cdot x$ for all $\alpha, \beta \in k$ and $x \in V$ (scalar multiplication
distributes over scalar addition);

(7) $\alpha \cdot (x \oplus y) = \alpha \cdot x \oplus \alpha \cdot y$ for all $\alpha \in k$ and $x, y \in V$ (scalar multiplication
distributes over vector addition);

(8) $1 \cdot x = x$ for all $x \in V$, where $1$ is the multiplicative identity in the field $k$.

Example 1.1. [1] Let $V = \mathbb{R}^n = \{x = (x_1, \dots, x_n) \mid x_i \in \mathbb{R}\}$ with the usual
vector addition and scalar multiplication.

[2] Let $V$ be the set of all polynomials with coefficients in $\mathbb{R}$ of degree $\le n$ with
usual addition of polynomials and scalar multiplication.

[3] Let $V$ be the set $M_{(m,n)}(\mathbb{C})$ of complex-valued $m \times n$ matrices, with usual
addition of matrices and scalar multiplication.

[4] Let $\ell_\infty$ denote the set of infinite sequences $(x_1, x_2, x_3, \dots)$ that are bounded:
$\sup\{|x_n|\} < \infty$. Then $\ell_\infty$ is a linear space, since $\sup\{|x_n + y_n|\} \le \sup\{|x_n|\} +
\sup\{|y_n|\} < \infty$ and $\sup\{|\alpha x_n|\} = |\alpha| \sup\{|x_n|\}$.

[5] Let $C(S)$ be the set of continuous functions $f : S \to \mathbb{R}$ with addition $(f \oplus g)(x) =
f(x) + g(x)$ and scalar multiplication $(\alpha \cdot f)(x) = \alpha f(x)$. Here $S$ is, for example,
any subset of $\mathbb{R}$. The dimension of $C(S)$ is infinite if $S$ is an infinite set, and is
exactly $|S|$ if not.¹

[6] Let $V$ be the set of Riemann-integrable functions $f : (0, 1) \to \mathbb{R}$ which are
square-integrable: that is, with the property that $\int_0^1 |f(x)|^2\,dx < \infty$. We need to
check that this is a linear space. Closure under scalar multiplication is clear: if
$\int_0^1 |f(x)|^2\,dx < \infty$ and $\alpha \in \mathbb{R}$ then $\int_0^1 |\alpha f(x)|^2\,dx = |\alpha|^2 \int_0^1 |f(x)|^2\,dx < \infty$. For
vector addition we need the Cauchy-Schwarz inequality:

$$
\begin{aligned}
\int_0^1 |f(x) + g(x)|^2\,dx &\le \int_0^1 \left(|f(x)|^2 + 2|f(x)||g(x)| + |g(x)|^2\right) dx \\
&\le \int_0^1 |f(x)|^2\,dx + 2\left(\int_0^1 |f(x)|^2\,dx\right)^{1/2} \left(\int_0^1 |g(x)|^2\,dx\right)^{1/2} + \int_0^1 |g(x)|^2\,dx < \infty.
\end{aligned}
$$

[7] Let $C^\infty[a, b]$ be the space of infinitely differentiable functions on $[a, b]$.

[8] Let $\Omega$ be a subset of $\mathbb{R}^n$, and $C^k(\Omega)$ the space of $k$ times continuously
differentiable functions. This means that if $a = (a_1, \dots, a_n) \in \mathbb{N}^n$ has $|a| = a_1 + \dots + a_n \le k$,
then the partial derivatives

$$
D^a f = \frac{\partial^{|a|} f}{\partial x_1^{a_1} \cdots \partial x_n^{a_n}}
$$

exist and are continuous.

From now on we will drop the special notation $\oplus$, $\cdot$ for vector addition and
scalar multiplication. We will also (normally) use plain letters $x, y$ and so on for
elements of linear spaces.



¹ This may be seen as follows. If $S = \{s_1, \dots, s_n\}$ is finite, then the map that sends a function
$f \in C(S)$ to the vector $(f(s_1), \dots, f(s_n)) \in \mathbb{R}^n$ is an isomorphism of linear spaces. If $S$ is infinite,
then the map that sends a polynomial $f \in \mathbb{R}[x]$ to the function $f \in C(S)$ is injective (since
two polynomials that agree on infinitely many values must be identical). This shows that $C(S)$
contains an isomorphic copy of an infinite-dimensional space, so must be infinite-dimensional.

2. Linear subspaces

As in the linear algebra of finite-dimensional vector spaces, subsets of linear
spaces that are themselves linear spaces are called linear subspaces.

Definition 1.2. Let $V$ be a linear space over the field $k$. A subset $W \subset V$
is a linear subspace of $V$ if for all $x, y \in W$ and $\alpha, \beta \in k$, the linear combination
$\alpha x + \beta y \in W$.

Example 1.2. [1] The set of vectors in $\mathbb{R}^n$ of the form $(x_1, x_2, x_3, 0, \dots, 0)$
forms a three-dimensional linear subspace.

[2] The set of polynomials of degree $\le r$ forms a linear subspace of the set of
polynomials of degree $\le n$ for any $r \le n$.

[3] (cf. Example 1.1 [8]) The space $C^{k+1}(\Omega)$ is a linear subspace of $C^k(\Omega)$.

3. Linear independence

Let $V$ be a linear space. Elements $x_1, x_2, \dots, x_n$ of $V$ are linearly dependent if
there are scalars $a_1, \dots, a_n$ (not all zero) such that

$$
a_1 x_1 + \dots + a_n x_n = 0.
$$

If there is no such set of scalars, then they are linearly independent.
The linear span of the vectors $x_1, x_2, \dots, x_n$ is the linear subspace

$$
\operatorname{span}\{x_1, \dots, x_n\} = \left\{ \sum_{j=1}^n a_j x_j \;\middle|\; a_1, \dots, a_n \in k \right\}.
$$

Definition 1.3. If the linear space $V$ is equal to the span of a linearly
independent set of $n$ vectors, then $V$ is said to have dimension $n$. If there is no such
set of vectors, then $V$ is infinite-dimensional.

A linearly independent set of vectors that spans $V$ is called a basis for $V$.

Example 1.3. [1] (cf. Example 1.1 [1]) The space $\mathbb{R}^n$ has dimension $n$; the
standard basis is given by the vectors $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ..., $e_n =
(0, \dots, 0, 1)$.

[2] (cf. Example 1.1 [2]) A basis is given by $\{1, t, t^2, \dots, t^n\}$, showing the space to
have dimension $n + 1$.

[3] Examples 1.1 [4], [5], [6], [7], [8] are all infinite-dimensional.

A norm on a vector space is a way of measuring distance between vectors.

Definition 1.4. A norm on a linear space $V$ over $k$ is a non-negative function
$\|\cdot\| : V \to \mathbb{R}$ with the properties that

(1) $\|x\| = 0$ if and only if $x = 0$ (positive definite);

(2) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in V$ (triangle inequality);

(3) $\|\alpha x\| = |\alpha| \|x\|$ for all $x \in V$ and $\alpha \in k$.

In Definition 1.4(3) we are assuming that $k$ is $\mathbb{R}$ or $\mathbb{C}$ and $|\cdot|$ denotes the usual
absolute value. If $\|\cdot\|$ is a function with properties (2) and (3) only it is called a
semi-norm.

Definition 1.5. A normed linear space is a linear space $V$ with a norm $\|\cdot\|$
(sometimes we write $\|\cdot\|_V$).

Definition 1.6. A set $C$ in a linear space is convex if for any two points
$x, y \in C$, $tx + (1 - t)y \in C$ for all $t \in [0, 1]$.

Definition 1.7. A norm $\|\cdot\|$ is strictly convex if $\|x\| = 1$, $\|y\| = 1$, $\|x + y\| = 2$
together imply that $x = y$.

We won't be using convexity methods much, but for each of the examples try to
work out whether or not the norm is strictly convex. Strict convexity is automatic
for Hilbert spaces.

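Example 1.4. [1] Let $V = \mathbb{R}^n$ with the usual Euclidean norm

$$
\|x\| = \|x\|_2 = \left( \sum_{j=1}^n |x_j|^2 \right)^{1/2}.
$$

To check this is a norm the only difficulty is the triangle inequality: for this we use
the Cauchy-Schwarz inequality

$$
\left| \sum_{j=1}^n x_j y_j \right| \le \left( \sum_{j=1}^n |x_j|^2 \right)^{1/2} \left( \sum_{j=1}^n |y_j|^2 \right)^{1/2}.
$$

[2] There are many other norms on $\mathbb{R}^n$, called the $p$-norms. For $1 \le p < \infty$ define

$$
\|x\|_p = \left( \sum_{j=1}^n |x_j|^p \right)^{1/p}.
$$

Then $\|\cdot\|_p$ is a norm on $V$: to check the triangle inequality use Minkowski's
inequality

$$
\left( \sum_{j=1}^n |x_j + y_j|^p \right)^{1/p} \le \left( \sum_{j=1}^n |x_j|^p \right)^{1/p} + \left( \sum_{j=1}^n |y_j|^p \right)^{1/p}.
$$

There is another norm corresponding to $p = \infty$,

$$
\|x\|_\infty = \max_{1 \le j \le n} \{|x_j|\}.
$$

It is conventional to write $\ell_p^n$ for these spaces. Notice that the linear spaces $\ell_p^n$ and
$\ell_q^n$ have exactly the same elements.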
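The $p$-norms of Example 1.4 [2] are easy to experiment with numerically. The following is a minimal sketch, not part of the course material: the helper names `p_norm` and `minkowski_holds` and the test vectors are illustrative assumptions. It evaluates the norms on short real vectors and spot-checks Minkowski's inequality for a few values of $p$.

```python
import math

def p_norm(x, p):
    """Return the p-norm of a finite real sequence x; p may be math.inf."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def minkowski_holds(x, y, p, tol=1e-12):
    """Check ||x + y||_p <= ||x||_p + ||y||_p up to rounding error."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(s, p) <= p_norm(x, p) + p_norm(y, p) + tol

# Hypothetical test vectors in R^2.
x = [3.0, -4.0]
y = [1.0, 2.0]
for p in (1, 2, 3, math.inf):
    print(p, p_norm(x, p), minkowski_holds(x, y, p))
```

A numerical check of this kind only illustrates the inequality on particular vectors; it is no substitute for the proof.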

[3] Let $X = \ell_\infty$ be the linear space of bounded infinite sequences (cf. Example
1.1 [4]). Consider the function

$$
\|x\|_p = \left( \sum_{j=1}^\infty |x_j|^p \right)^{1/p}.
$$

If we restrict attention to the linear subspace on which $\|\cdot\|_p$ is finite, then $\|\cdot\|_p$ is a
norm (to check this use the infinite version of Minkowski's inequality). This gives
an infinite family of normed linear spaces,

$$
\ell_p = \{x = (x_1, x_2, \dots) \mid \|x\|_p < \infty\}.
$$

Notice that for $p < \infty$ there is a strict inclusion $\ell_p \subset \ell_\infty$. Indeed, for any $p < q$
there is a strict inclusion $\ell_p \subset \ell_q$, so $\ell_p$ is a linear subspace of $\ell_q$. That is, the sets
$\ell_p$ and $\ell_q$ for $p \ne q$ do not contain the same elements.

[4] Let $X = C[a, b]$, and put $\|f\| = \sup_{t \in [a, b]} |f(t)|$. This is called the uniform or
supremum norm. Why is it finite?

[5] Let $X = C[a, b]$, and choose $1 \le p < \infty$. Then (using the integral form of
Minkowski's inequality) we have the $p$-norm

$$
\|f\|_p = \left( \int_a^b |f(t)|^p\,dt \right)^{1/p}.
$$

[6] (cf. Example 1.1 [6]). Let $V$ be the set of Riemann-integrable functions $f :
(0, 1) \to \mathbb{R}$ which are square-integrable. Let $\|f\|_2 = \left( \int_0^1 |f(x)|^2\,dx \right)^{1/2} < \infty$. Then $V$ is
a normed linear space.
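The strict inclusion $\ell_p \subset \ell_q$ for $p < q$ noted in Example 1.4 [3] can be made concrete with the sequence $x_n = 1/n$, which lies in $\ell_2$ but not in $\ell_1$. The sketch below is an illustration only; the function name `partial_p_sum` is an assumption. It prints partial sums of $\sum |x_n|^p$ for $p = 1$ and $p = 2$: the first keeps growing (like $\log N$), the second stays below $\pi^2/6$.

```python
import math

def partial_p_sum(p, terms):
    """Partial sum of |1/n|^p for n = 1..terms."""
    return sum((1.0 / n) ** p for n in range(1, terms + 1))

for terms in (10, 1000, 100000):
    # p = 1: harmonic partial sums, unbounded; p = 2: bounded by pi^2 / 6.
    print(terms, partial_p_sum(1, terms), partial_p_sum(2, terms))

print("pi^2 / 6 =", math.pi ** 2 / 6)
```

Finite partial sums cannot prove divergence, of course; the point is only to see the two behaviours side by side.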

5. Isomorphism of normed linear spaces 

Recall from linear algebra that linear spaces $V$ and $W$ are (algebraically)
isomorphic if there is a bijection $T : V \to W$ that is linear:

$$
T(\alpha x + \beta y) = \alpha T(x) + \beta T(y)
$$

for all $\alpha, \beta \in k$ and $x, y \in V$.

A pair $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ of normed linear spaces are (topologically)
isomorphic if there is a linear bijection $T : X \to Y$ with the property that there are
positive constants $a, b$ with

$$
a\|x\|_X \le \|T(x)\|_Y \le b\|x\|_X. \tag{1}
$$

We shall usually denote topological isomorphism by $X \cong Y$.

Lemma 1.1. If $X$ and $Y$ are $n$-dimensional normed linear spaces over $\mathbb{R}$ (or
$\mathbb{C}$) then $X$ and $Y$ are topologically isomorphic.

If the constants $a$ and $b$ in equation (1) may both be taken as 1, so $\|T(x)\|_Y =
\|x\|_X$, then $T$ is called an isometry and the normed spaces $X$ and $Y$ are called
isometric.

Example 1.5. The real linear spaces $(\mathbb{C}, |\cdot|)$ and $(\mathbb{R}^2, \|\cdot\|_2)$ are isometric.

If $Y$ is a subspace of a linear normed space $(X, \|\cdot\|_X)$ then $\|\cdot\|_X$ restricted to
$Y$ makes $Y$ into a normed subspace.

Example 1.6. Let $Y$ denote the space of infinite real sequences with only
finitely many non-zero terms. Then $Y$ is a linear subspace of $\ell_p$ for any $1 \le p \le \infty$
so the $p$-norm makes $Y$ into a normed space.

6. Products of normed spaces 

If $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are normed linear spaces, then the product

$$
X \times Y = \{(x, y) \mid x \in X, y \in Y\}
$$

is a linear space which may be made into a normed space in many different ways,
a few of which follow.

Example 1.7. [1] $\|(x, y)\| = \left( \|x\|_X^p + \|y\|_Y^p \right)^{1/p}$ (for a fixed $1 \le p < \infty$);
[2] $\|(x, y)\| = \max\{\|x\|_X, \|y\|_Y\}$.

7. Continuous maps between normed spaces

We have seen continuous maps between $\mathbb{R}$ and $\mathbb{R}$ in first year analysis. To make
this definition we used the distance function $|x - y|$ on $\mathbb{R}$: a function $f : \mathbb{R} \to \mathbb{R}$ is
continuous if

$$
\forall a \in \mathbb{R},\ \forall \epsilon > 0,\ \exists \delta > 0 \text{ such that } |x - a| < \delta \implies |f(x) - f(a)| < \epsilon. \tag{2}
$$

Looking at (2), we see that exactly the same definition can be made for maps
between linear normed spaces, which in view of Example 1.4 will give us the possibility
of talking about continuous maps between spaces of functions. Thus, on suitably
defined spaces, questions like "is the map $f \mapsto f'$ continuous?" or "is the map
$f \mapsto \int_0^x f$ continuous?" can be asked.

Definition 1.8. A map $f : X \to Y$ between normed linear spaces $(X, \|\cdot\|_X)$
and $(Y, \|\cdot\|_Y)$ is continuous at $a \in X$ if

$$
\forall \epsilon > 0\ \exists \delta = \delta(\epsilon, a) > 0 \text{ such that } \|x - a\|_X < \delta \implies \|f(x) - f(a)\|_Y < \epsilon.
$$

If $f$ is continuous at every $a \in X$ then we simply say $f$ is continuous.
Finally, $f$ is uniformly continuous if

$$
\forall \epsilon > 0\ \exists \delta = \delta(\epsilon) > 0 \text{ such that } \|x - y\|_X < \delta \implies \|f(x) - f(y)\|_Y < \epsilon \quad \forall x, y \in X.
$$

Example 1.8. [1] The map $x \mapsto x^2$ from $(\mathbb{R}, |\cdot|)$ to itself is continuous but not
uniformly continuous.

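[2] Let $f(x) = Ax$ be the non-trivial linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ (with Euclidean
norms) defined by the $m \times n$ matrix $A = (a_{ij})$. Using the Cauchy-Schwarz
inequality, we see that $f$ is uniformly continuous: fix $a \in \mathbb{R}^n$ and $b = Aa$. Then for
any $x \in \mathbb{R}^n$ we have

$$
\begin{aligned}
\|Ax - Aa\|^2 &= \sum_{i=1}^m \left| \sum_{j=1}^n a_{ij}(x_j - a_j) \right|^2 \\
&\le \sum_{i=1}^m \left( \sum_{j=1}^n |a_{ij}|^2 \right) \left( \sum_{j=1}^n |x_j - a_j|^2 \right) \\
&= C^2 \|x - a\|^2
\end{aligned}
$$

where $C^2 = \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 > 0$. It follows that $f$ is uniformly continuous, and
we may take $\delta = \epsilon / C$.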
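The bound in Example 1.8 [2] says that $\|Ax - Aa\| \le C\|x - a\|$, where $C$ is the square root of the sum of the squared entries of $A$. A minimal numerical sketch follows; the matrix, the vectors and the helper names `mat_vec`, `euclid` and `entry_norm` are made up for illustration.

```python
import math

def mat_vec(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def euclid(x):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(t * t for t in x))

def entry_norm(A):
    """The constant C: square root of the sum of squared entries of A."""
    return math.sqrt(sum(a * a for row in A for a in row))

# Hypothetical 2 x 3 example.
A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]
x = [1.0, 0.5, -2.0]
a = [0.0, 1.0, 1.0]

lhs = euclid([u - v for u, v in zip(mat_vec(A, x), mat_vec(A, a))])
rhs = entry_norm(A) * euclid([u - v for u, v in zip(x, a)])
print(lhs, "<=", rhs)
```

The constant $C$ is usually far from the best possible Lipschitz constant, but any finite constant is enough for uniform continuity.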

[3] Let $X$ be the space of continuous functions $[-1, 1] \to \mathbb{R}$ with the sup norm (cf.
Example 1.4 [4]). Define a map $F : X \to X$ by $F(u) = w$, where

$$
w(t) = 1 + \int_0^t (\sin u(s) + \tan s)\,ds.
$$

The map $F$ is uniformly continuous on $X$. Notice that $F$ is intimately connected
to a certain differential equation: a fixed point for $F$ (that is, an element $u \in X$
for which $F(u) = u$) is a continuous solution to the ordinary differential equation

$$
\frac{du}{dt} = \sin(u) + \tan(t); \qquad u(0) = 1,
$$

in the region $t \in [-1, 1]$. We shall see later that $F$ does indeed have a fixed point;
knowing that $F$ is uniformly continuous is a step towards this. To see that $F$ is
continuous, calculate

$$
\begin{aligned}
\|F(u) - F(v)\| &= \sup_{t \in [-1, 1]} |F(u)(t) - F(v)(t)| \\
&= \sup_{t \in [-1, 1]} \left| 1 + \int_0^t (\sin u(s) + \tan s)\,ds - \left( 1 + \int_0^t (\sin v(s) + \tan s)\,ds \right) \right| \\
&= \sup_{t \in [-1, 1]} \left| \int_0^t (\sin u(s) - \sin v(s))\,ds \right| \\
&\le \sup_{t \in [-1, 1]} \left| \int_0^t |\sin u(s) - \sin v(s)|\,ds \right| \\
&\le \|u - v\|.
\end{aligned}
$$

Notice we have used the inequality $|\sin u - \sin v| \le |u - v|$, an easy consequence of
the Mean Value Theorem.
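The estimate $\|F(u) - F(v)\| \le \|u - v\|$ from Example 1.8 [3] can be observed on a grid. The sketch below is an illustration under stated assumptions: functions on $[-1, 1]$ are represented by their values at equally spaced points, the integral from $0$ to $t$ is approximated by the trapezoidal rule, and the names `F`, `sup_dist` and the two sample functions are hypothetical choices.

```python
import math

N = 200                                   # grid has 2N + 1 points, t = 0 at index N
ts = [-1.0 + k / N for k in range(2 * N + 1)]
h = 1.0 / N

def F(u):
    """Discretised F(u)(t) = 1 + integral from 0 to t of (sin u(s) + tan s) ds."""
    g = [math.sin(u_k) + math.tan(t) for u_k, t in zip(u, ts)]
    w = [0.0] * len(ts)
    w[N] = 1.0                            # value at t = 0
    for k in range(N + 1, 2 * N + 1):     # integrate forwards from 0
        w[k] = w[k - 1] + 0.5 * h * (g[k - 1] + g[k])
    for k in range(N - 1, -1, -1):        # integrate backwards from 0
        w[k] = w[k + 1] - 0.5 * h * (g[k + 1] + g[k])
    return w

def sup_dist(u, v):
    """Grid version of the sup norm of u - v."""
    return max(abs(a - b) for a, b in zip(u, v))

u = [math.cos(3 * t) for t in ts]         # two arbitrary continuous functions
v = [t * t for t in ts]
print(sup_dist(F(u), F(v)), "<=", sup_dist(u, v))
```

Applying `F` repeatedly to a starting guess gives a discrete analogue of looking for the fixed point mentioned in the example.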

[4] Let $X$ be the space of complex-valued square-integrable Riemann-integrable
functions on $[0, 1]$ with the 2-norm (cf. Example 1.4 [6]). Define a map $F : X \to X$ by
$F(u) = v$, with

$$
v(t) = \int_0^t u^2(s)\,ds.
$$

Then $F$ is continuous (but not uniformly continuous):

$$
\begin{aligned}
|Fu(t) - Fv(t)| &= \left| \int_0^t (u^2(s) - v^2(s))\,ds \right| \\
&\le \int_0^t (|u(s)| + |v(s)|)\,|u(s) - v(s)|\,ds \\
&\le \left( \int_0^1 (|u(s)| + |v(s)|)^2\,ds \right)^{1/2} \left( \int_0^1 |u(s) - v(s)|^2\,ds \right)^{1/2},
\end{aligned}
$$

so that

$$
\|Fu - Fv\|_2 \le \sup_{t \in [0, 1]} |Fu(t) - Fv(t)| \le \big\| |u| + |v| \big\|_2\, \|u - v\|_2.
$$



[5] The same map as in [4] applied to square-integrable Riemann-integrable
functions on $[0, \infty)$ is not continuous. To see this, let $a, b \in \mathbb{R}$ be positive and define

$$
u(t) = \begin{cases} a, & 0 \le t < 2b^2, \\ ia, & 2b^2 \le t < 4b^2, \\ 0, & \text{otherwise.} \end{cases}
$$

Then $\|u - 0\|_2 = 2ab$. On the other hand,

$$
F(u)(t) = \begin{cases} a^2 t, & 0 \le t < 2b^2, \\ 4b^2 a^2 - a^2 t, & 2b^2 \le t < 4b^2, \\ 0, & \text{otherwise.} \end{cases}
$$

Then $\|F(u) - F(0)\|_2^2 = \frac{16}{3} a^4 b^6$. Now, given any $\delta > 0$ we may choose constants
$a, b$ with $2ab < \delta$ but $\frac{16}{3} a^4 b^6 = 1$. That is, given any $\delta > 0$ there is a function $u$
with the property that $\|u - 0\| < \delta$ but $\|F(u) - F(0)\| = 1$, showing that $F$ is not
continuous.

The moral is that the topological properties of infinite-dimensional spaces are a little
counter-intuitive.
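The choice of constants in Example 1.8 [5] can be made explicit. The sketch below is an illustration only: it fixes $2ab = \delta/2$ (an arbitrary choice that guarantees $2ab < \delta$), solves $\frac{16}{3}a^4b^6 = 1$ for $b$, and then approximates $\|F(u)\|_2$ by a midpoint rule. The function names `witness`, `norm_u` and `norm_Fu` are hypothetical.

```python
import math

def witness(delta):
    """Return (a, b) with 2ab = delta / 2 and (16/3) a^4 b^6 = 1."""
    c = delta / 4.0                       # c = a * b
    b = math.sqrt(3.0) / (4.0 * c * c)    # from c^4 * b^2 = 3 / 16
    a = c / b
    return a, b

def norm_u(a, b):
    """||u||_2 = (|a|^2 * 4 b^2)^(1/2) = 2ab for positive a, b."""
    return 2.0 * a * b

def norm_Fu(a, b, steps=20000):
    """Midpoint-rule approximation of ||F(u)||_2 over [0, 4 b^2]."""
    T = 4.0 * b * b
    h = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        Fu = a * a * t if t < 2.0 * b * b else a * a * (T - t)
        total += Fu * Fu * h
    return math.sqrt(total)

a, b = witness(0.1)
print(norm_u(a, b), norm_Fu(a, b))        # small input norm, output norm close to 1
```

As $\delta$ shrinks, $b$ grows like $\delta^{-2}$: the function $u$ becomes very small but is spread over a very long interval, which is only possible because the domain $[0, \infty)$ is unbounded.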

8. Sequences and completeness in normed spaces 

Just as for continuity, we can use the norm on a normed linear space to define 
convergence for sequences and series in a normed space using the corresponding 
notion for R. 

Let $X = (X, \|\cdot\|_X)$ be a normed linear space. A sequence $(x_n)$ in $X$ is said to
converge to $a \in X$ if

$$
\|x_n - a\| \to 0
$$

as $n \to \infty$.

Similarly, a series $\sum_{n=1}^\infty x_n$ converges if the sequence of partial sums $(s_N)$
defined by $s_N = \sum_{n=1}^N x_n$ is a convergent sequence.

Example 1.9. [1] If $(x_j)$ is a sequence in $\mathbb{R}^n$, with $x_j = (x_j^{(1)}, \dots, x_j^{(n)})$, then
check that

$$
\|x_j\|_p \to 0
$$

(that is, $(x_j)$ converges to $0$ in the space $\ell_p^n$) if and only if $x_j^{(k)} \to 0$ in $\mathbb{R}$ for each
$k = 1, \dots, n$.

[2] For infinite-dimensional spaces, it is not enough to check convergence on each
component using a basis. Let $(x_j)$ be the sequence in $\ell_p$ defined by

$$
x_j = (0, 0, \dots, 1, \dots)
$$

(where the 1 appears in the $j$th position). Then if we write $x_j = (x_j^{(1)}, x_j^{(2)}, \dots)$ we
certainly have $x_j^{(k)} \to 0$ as $j \to \infty$ for each $k$. However, we also have $\|x_j\|_p = 1$ for
all $j$, so the sequence is certainly not converging to 0. Indeed, it is not converging
to anything.
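Example 1.9 [2] can be explored with truncated sequences. In the sketch below (an illustration; the names `unit_vector` and `p_norm`, the truncation length and the value of $p$ are assumptions) each $x_j$ is stored by its first 50 coordinates. Every fixed coordinate is eventually 0 along the sequence, each $x_j$ has norm 1, and any two distinct terms are at distance $2^{1/p}$, so the sequence is not even Cauchy.

```python
def p_norm(x, p):
    """p-norm of a finite real sequence, for finite p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def unit_vector(j, length):
    """First `length` coordinates of the sequence with a 1 in position j (1-based)."""
    return [1.0 if k == j else 0.0 for k in range(1, length + 1)]

length, p = 50, 3
x = {j: unit_vector(j, length) for j in range(1, length + 1)}

k = 5
coords = [x[j][k - 1] for j in range(1, length + 1)]   # k-th coordinate of x_1, x_2, ...
print(coords[:8])                                      # zero except at j = k
print(p_norm(x[7], p))                                 # each x_j has norm 1
diff = [s - t for s, t in zip(x[1], x[2])]
print(p_norm(diff, p), 2 ** (1.0 / p))                 # distance between distinct terms
```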

Lemma 1.2. A map $F : X \to Y$ between normed linear spaces is continuous at
$a \in X$ if and only if

$$
\lim_{n \to \infty} F(x_n) = F(a)
$$

for every sequence $(x_n)$ converging to $a$.

Proof. Replace $|\cdot|$ with $\|\cdot\|$ in the proof of this statement for functions
$\mathbb{R} \to \mathbb{R}$. □

Definition 1.9. A sequence $(x_n)$ is a Cauchy sequence if

$$
\forall \epsilon > 0\ \exists N \text{ such that } n, m > N \implies \|x_n - x_m\| < \epsilon.
$$

It is clear that a convergent sequence is a Cauchy sequence. We know that in
the normed linear space $(\mathbb{R}, |\cdot|)$ the converse also holds, and it is a simple matter
to check that in $\mathbb{R}^n$ the converse holds. In many reasonable infinite-dimensional
normed linear spaces, however, there are Cauchy sequences that do not converge.

Definition 1.10. A normed linear space is said to be complete if all Cauchy
sequences are convergent.

Example 1.10. [1] The sequence $3, \frac{31}{10}, \frac{314}{100}, \dots$ of decimal truncations is a Cauchy sequence
of rationals converging to the real number $\pi$.

[2] Consider the space $C[0, 1]$ of continuous functions under the sup norm (cf.
Example 1.4 [4]). This is complete.

[3] The space $C[0, 1]$ under the 2-norm (cf. Example 1.4 [5]) is not complete. To
see this, consider the sequence of functions

$$
u_n(t) = \begin{cases} 0, & 0 \le t \le \frac12 - \frac1n, \\ \frac{nt}{2} - \frac{n}{4} + \frac12, & \frac12 - \frac1n \le t \le \frac12 + \frac1n, \\ 1, & \frac12 + \frac1n \le t \le 1. \end{cases}
$$

Then $(u_n)$ is a Cauchy sequence, since $u_m$ and $u_n$ agree outside $[\frac12 - \frac1n, \frac12 + \frac1n]$ when $m > n$ and $|u_m - u_n| \le 1$, so

$$
\begin{aligned}
\|u_m - u_n\|_2^2 &= \int_0^1 |u_m(t) - u_n(t)|^2\,dt \\
&= \int_{1/2 - 1/n}^{1/2} |u_m(t) - u_n(t)|^2\,dt + \int_{1/2}^{1/2 + 1/n} |u_m(t) - u_n(t)|^2\,dt \\
&\le \frac{2}{n} \to 0 \quad \text{as } m > n \to \infty.
\end{aligned}
$$

We claim that the sequence $(u_n)$ is not convergent in $C[0, 1]$ under the 2-norm. To
see this, let $g$ be the function defined by $g(t) = 0$ for $0 \le t \le \frac12$ and $g(t) = 1$ for
$\frac12 < t \le 1$, and assume that there is a continuous function $f$ with $\|u_n - f\|_2 \to 0$
as $n \to \infty$. It is clear that $\|u_n - g\|_2 \to 0$ as $n \to \infty$ also, so we must have

$$
\|f - g\|_2 = 0. \tag{3}
$$

Now examine $f(\frac12)$. If $f(\frac12) \ne g(\frac12) = 0$ then $|f - g|$ must be positive on $(\frac12 - \delta, \frac12)$
for some $\delta > 0$, which contradicts (3). We must therefore have $f(\frac12) = 0$; but in
this case $|f - g|$ must be positive on $(\frac12, \frac12 + \delta)$ for some $\delta > 0$, again contradicting
(3). We conclude that there is no continuous function $f$ that is the 2-norm limit
of the sequence $(u_n)$. Thus the normed space $(C[0, 1], \|\cdot\|_2)$ is not complete.
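The convergence $\|u_n - g\|_2 \to 0$ used in Example 1.10 [3] can be checked numerically. The sketch below is illustrative only: the names `u`, `g` and `l2_dist_to_g` are assumptions, the integral is approximated by a midpoint rule, and $u_n$ is taken to be the piecewise linear ramp from 0 to 1 across $[\frac12 - \frac1n, \frac12 + \frac1n]$. A direct calculation under that assumption gives $\|u_n - g\|_2^2 = \frac{1}{6n}$, which the printed values should approximate.

```python
import math

def u(n, t):
    """Piecewise linear ramp: 0 up to 1/2 - 1/n, then linear, then 1 from 1/2 + 1/n."""
    if t <= 0.5 - 1.0 / n:
        return 0.0
    if t >= 0.5 + 1.0 / n:
        return 1.0
    return n * t / 2.0 - n / 4.0 + 0.5

def g(t):
    """The discontinuous candidate limit."""
    return 0.0 if t <= 0.5 else 1.0

def l2_dist_to_g(n, steps=20000):
    """Midpoint-rule approximation of the 2-norm of u_n - g on [0, 1]."""
    h = 1.0 / steps
    total = sum((u(n, (k + 0.5) * h) - g((k + 0.5) * h)) ** 2 for k in range(steps))
    return math.sqrt(total * h)

for n in (4, 16, 64, 256):
    print(n, l2_dist_to_g(n), 1.0 / math.sqrt(6.0 * n))
```

In the sup norm, by contrast, $u_n$ stays at distance about $\frac12$ from $g$ near $t = \frac12$, which is consistent with $C[0, 1]$ being complete under the sup norm.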

9. Topological language 

There are certain properties of subsets of normed linear spaces (and other 
more general spaces) that we use very often. Topology is a subject that begins 
by attaching names to these properties and then develops a shorthand for talking 
about such things. 

Definition 1.11. Let $X$ be a normed linear space.

A set $C \subset X$ is closed if whenever $(c_n)$ is a sequence in $C$ that is a convergent
sequence in $X$, the limit $\lim_{n \to \infty} c_n$ also lies in $C$.

A set $U \subset X$ is open if for every $u \in U$ there exists $\epsilon > 0$ such that $\|x - u\| <
\epsilon \implies x \in U$.

A set $S \subset X$ is bounded if there is an $R < \infty$ with the property that $x \in
S \implies \|x\| < R$.

A set $S \subset X$ is connected if there do not exist open sets $A, B$ in $X$ with
$S \subset A \cup B$, $S \cap A \ne \emptyset$, $S \cap B \ne \emptyset$ and $S \cap A \cap B = \emptyset$.

Associated to any set $S \subset X$ in a normed space there are sets $S^\circ \subset S \subset \bar{S}$
defined as follows: the interior of $S$ is the set

$$
S^\circ = \{x \in X \mid \exists \epsilon > 0 \text{ such that } \|x - y\| < \epsilon \implies y \in S\}, \tag{4}
$$

and the closure of $S$,

$$
\bar{S} = \{x \in X \mid \forall \epsilon > 0\ \exists s \in S \text{ such that } \|s - x\| < \epsilon\}. \tag{5}
$$

Exercise 1.1. [1] Prove that a map $f : X \to Y$ is continuous (cf. Definition
1.8) if and only if for every open set $U \subset Y$, the pre-image $f^{-1}(U) \subset X$ is also
open.

[2] Show by example that for a continuous map $f : \mathbb{R} \to \mathbb{R}$ there may be open sets
$U$ for which $f(U)$ is not open.

It is clear from first year analysis that closed bounded sets (closed intervals, 
for example) have special properties. For example, recall the theorem of Bolzano- 
Weierstrass. 

Theorem 1.1. Let $S$ be a closed and bounded subset of $\mathbb{R}$. Then a continuous
function $f : S \to \mathbb{R}$ attains its bounds: there exist $\xi, \eta \in S$ with the property that

$$
f(\xi) = \sup_{s \in S} f(s), \qquad f(\eta) = \inf_{s \in S} f(s).
$$

Definition 1.12. A subset $S$ of a normed linear space is (sequentially)
compact if and only if every sequence $(s_n)$ in $S$ has a subsequence $(s_{n_j}) = (s_{n_1}, s_{n_2}, \dots)$
that converges in $S$.

Recall the following theorem (the Heine-Borel theorem), which is really the
same one as Theorem 1.1.

Theorem 1.2. A subset of $\mathbb{R}^n$ is compact if and only if it is closed and bounded.

By now you should be used to the idea that such a result need not extend to
infinite-dimensional normed linear spaces: Example 1.9 [2] is a bounded sequence
with no convergent subsequence, so Theorem 1.2 does not extend
to infinite-dimensional normed spaces. However the analogue of Theorem 1.1 does
hold in great generality. This is also a version of the Bolzano-Weierstrass theorem.

Theorem 1.3. If $A$ is a compact subset of a normed linear space $X$, and $f :
X \to Y$ is a continuous map between normed linear spaces, then $f(A)$ is a compact
subset of $Y$.

As an exercise, convince yourself that Theorem 1.3 implies Theorem 1.1. 
Some standard sets are used so often that we give them special names. 

Definition 1.13. Let $X$ be a normed space. Then the open ball of radius
$\epsilon > 0$ and centre $x_0$ is the set

$$
B_\epsilon(x_0) = \{x \in X \mid \|x - x_0\| < \epsilon\}.
$$

The closed ball of radius $\epsilon > 0$ and centre $x_0$ is the set

$$
\bar{B}_\epsilon(x_0) = \{x \in X \mid \|x - x_0\| \le \epsilon\}.
$$

Exercise 1.2. Open and closed balls in normed spaces are convex (cf.
Definition 1.6).

Definition 1.14. A subset $S$ of a normed space $X$ is dense if every open ball
in $X$ has non-empty intersection with $S$. A normed space is said to be separable
if there is a countable set $S = \{x_1, x_2, \dots\}$ that is dense in $X$.

10. Quotient spaces

As an application of Section 9, quotients of normed spaces may be formed.
Notice that we need both the algebraic structure (subspace of a linear space) and
a topological property (closed) to make it all work.

Recall from Definition 1.11 and Definition 1.2 that a closed linear subspace $Y$
of a normed linear space $X$ is a subset $Y \subset X$ that is itself a linear space, with the
property that any sequence $(y_n)$ of elements of $Y$ that converges in $X$ has the limit
in $Y$.

The linear space $X/Y$ (the quotient or factor space) is formed as follows. The
elements of $X/Y$ are cosets of $Y$: sets of the form $x + Y$ for $x \in X$. The set of
cosets is a linear space under the operations

$$
(x_1 + Y) \oplus (x_2 + Y) = (x_1 + x_2) + Y, \qquad \lambda \cdot (x + Y) = \lambda x + Y.
$$

Notice that this makes sense precisely because $Y$ is itself a linear space, so for
example $Y + Y = Y$ and $\lambda Y = Y$ for $\lambda \ne 0$. Two cosets $x_1 + Y$ and $x_2 + Y$ are
equal if as sets $x_1 + Y = x_2 + Y$, which is true if and only if $x_1 - x_2 \in Y$.

Example 1.11. [1] Let $X = \mathbb{R}^3$, and let $Y$ be the subspace spanned by $(1, 1, 0)$.
Then $X/Y$ is a two-dimensional real vector space. There are many pairs of elements
that generate $X/Y$, for example

$$
(1, 0, 1) + Y \quad \text{and} \quad (0, 0, 1) + Y.
$$

[2] The linear space $Y$ of finitely supported sequences in $\ell_1$ is a linear subspace. The
quotient space $\ell_1/Y$ is very hard to visualize: its elements are equivalence classes
under the relation $(x_n) \sim (y_n)$ if the sequences $(x_n)$ and $(y_n)$ differ in finitely many
positions.

[3] The linear space $Y$ of $\ell_1$ sequences of the form $(0, \dots, 0, x_{n+1}, \dots)$ (first $n$ are
zero) is a linear subspace of $\ell_1$. Here the quotient space $\ell_1/Y$ is quite reasonable:
in fact it is isomorphic to $\mathbb{R}^n$.

[4] We know that for $p, q \in [1, \infty]$, $p < q \implies \ell_p \subset \ell_q$. This means that for
any $p < q$ there is a linear quotient space $\ell_q/\ell_p$. These quotient spaces are very
pathological.

[5] The linear space $Y = C[0, 1]$ is a linear subspace of the space $X$ of
square-Riemann-integrable functions on $[0, 1]$. The quotient $X/Y$ is again a linear space
that is impossible to work with.

[6] Let $X = C[0, 1]$, and let $Y = \{f \in X \mid f(0) = 0\}$. Then $X/Y$ is isomorphic to $\mathbb{R}$.

It is clear from these examples that not all linear subspaces are equally good: 
Examples 1.11 [1], [3], and [6] are quite reasonable, whereas [2], [4] and [5] are 
examples of linear spaces unlike any we have seen. The reason is the following: the 
space X/Y is guaranteed to be a normed space with a norm related to the original 
norm on X only when the subspace Y is itself closed. Notice that Examples 1.11 
[1], [3], and [6] are precisely the ones in which the subspace is closed. 

Theorem 1.4. If $X$ is a normed space, and $Y$ is a closed linear subspace,
then $X/Y$ is a normed space under the norm

$$
\|x + Y\| = \inf_{z \in x + Y} \|z\|. \tag{6}
$$

Before proving this theorem, try to convince yourself that the norm (6) is the
obvious candidate: if $X = \mathbb{R}^2$ and $Y = (1, 0)\mathbb{R}$, then the space $X/Y$ consists of
lines in $X$ of the form $(s, t) + Y$. Notice that each such line may be written uniquely
in the form $(0, t) + Y$, and this choice minimizes the norm of the element of $X$ that
represents the line.
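For the example $X = \mathbb{R}^2$, $Y = (1, 0)\mathbb{R}$ just described, the quotient norm (6) of the coset $(s, t) + Y$ should come out as $|t|$ when $X$ carries the Euclidean norm. A minimal sketch follows; the function name `coset_norm` and the brute-force grid search are illustrative assumptions, not an efficient method.

```python
import math

def coset_norm(s, t, radius=10.0, steps=20001):
    """Approximate inf over r of ||(s + r, t)||_2 by a grid search over r in [-radius, radius]."""
    best = math.inf
    for k in range(steps):
        r = -radius + 2.0 * radius * k / (steps - 1)
        best = min(best, math.hypot(s + r, t))
    return best

# The coset (3, -2) + Y is the horizontal line at height -2; its norm should be 2.
print(coset_norm(3.0, -2.0))
```

The minimum is attained at the representative $(0, t)$, which matches the remark that this choice minimizes the norm of the element representing the line.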

Proof. Let $z + Y$ be any coset of $Y$ in $X$, and let $(x_n) \subset z + Y$ be a convergent
sequence with $x_n \to x$. Then for any fixed $n$, $x_n - x_m \to x_n - x$ (as $m \to \infty$) is a sequence
in $Y$ converging in $X$. Since $Y$ is closed, we must have $x_n - x \in Y$, so $x + Y =
x_n + Y = z + Y$. That is, the limit of the sequence defines the same coset as does
the sequence: the set $z + Y$ is a closed set.

Assume now that $\|x + Y\| = 0$. Then there is a sequence $(x_n) \subset x + Y$ with
$\|x_n\| \to 0$. Since $x + Y$ is closed and $x_n \to 0$, we must have $0 \in x + Y$, so $x + Y = Y$,
the zero element in $X/Y$.

Homogeneity is clear:

$$
\|\lambda(x + Y)\| = \inf_{z \in x + Y} \|\lambda z\| = |\lambda| \inf_{z \in x + Y} \|z\| = |\lambda| \|x + Y\|.
$$

Finally, the triangle inequality:

$$
\begin{aligned}
\|(x_1 + Y) + (x_2 + Y)\| &= \inf_{z_1 \in x_1 + Y;\ z_2 \in x_2 + Y} \|z_1 + z_2\| \\
&\le \inf_{z_1 \in x_1 + Y} \|z_1\| + \inf_{z_2 \in x_2 + Y} \|z_2\| \\
&= \|x_1 + Y\| + \|x_2 + Y\|.
\end{aligned}
$$

□



Example 1.12. Even if the subspace is closed, the quotient space may be a
little odd. For example, let $c$ denote the space of all sequences $(x_n)$ with the
property that $\lim_n x_n$ exists. This is a closed subspace of $\ell_\infty$. What is the quotient
$\ell_\infty / c$?



CHAPTER 2 



Banach spaces 

It turns out to be very important and natural to work in complete spaces - 
trying to do functional analysis in non-complete spaces is a little like trying to do 
elementary analysis over the rationals. 

Definition 2.1. A complete normed linear space is called¹ a Banach space.

Example 2.1. [1] We are already familiar with a large class of Banach spaces:
any finite-dimensional normed linear space is a Banach space. In our notation, this
means that $\ell_p^n$ is a Banach space for all $1 \le p \le \infty$ and all $n$.

[2] The space of continuous functions with the sup norm is a Banach space (cf.
Example 1.4 [4] and Example 1.10 [2]).

[3] The sequence space $\ell_p$ is a Banach space. To see this, assume that $(x_n)$ is a
Cauchy sequence in $\ell_p$, and write

$$
x_n = (x_n^{(1)}, x_n^{(2)}, \dots).
$$

Recall that $\|\cdot\|_p \ge \|\cdot\|_\infty$ for all $p$ (cf. Example 1.4 [3]). So, given $\epsilon > 0$ we may
find $N$ with the property that

$$
m, n > N \implies \|x_n - x_m\|_p < \epsilon
$$

which in turn implies that $\|x_n - x_m\|_\infty < \epsilon$, so for each $k$, $|x_n^{(k)} - x_m^{(k)}| < \epsilon$. That
is, if $(x_n)$ is a Cauchy sequence in $\ell_p$, then for each $k$, $(x_n^{(k)})$ is a Cauchy sequence
in $\mathbb{R}$. Since $\mathbb{R}$ is complete, we deduce that for each $k$ we have $x_n^{(k)} \to y^{(k)}$. Notice
that this does not imply by itself that $x_n \to y$ (cf. Example 1.9 [2]). However, if we
know (as we do) that $(x_n)$ is Cauchy, then it does: we prove this for $p < \infty$ but
the $p = \infty$ case is similar. Fix $\epsilon > 0$, and use the Cauchy criterion to find $N$ such
that $n, m > N$ implies that

$$
\sum_{k=1}^\infty |x_n^{(k)} - x_m^{(k)}|^p < \epsilon.
$$

Now fix $n$ and let $m \to \infty$ to see that

$$
\sum_{k=1}^\infty |x_n^{(k)} - y^{(k)}|^p \le \epsilon
$$

(notice that $<$ has become $\le$). This last inequality means that

$$
\|x_n - y\|_p \le \epsilon^{1/p},
$$

showing that in $\ell_p$, $x_n \to y = (y^{(1)}, y^{(2)}, \dots)$.

¹ After the Polish mathematician Stefan Banach (1892-1945) who gave the first abstract
treatment of complete normed spaces in his 1920 thesis (Fundamenta Math., 3, 133-181, 1922).
His later book (Théorie des opérations linéaires, Warsaw, 1932) laid the foundations of functional
analysis.

Lemma 2.1. Let $(x_n)$ be a sequence in a Banach space. If the series $\sum_{n=1}^\infty x_n$
is absolutely convergent, then it is convergent.

Recall that absolutely convergent means that the numerical series $\sum_{n=1}^\infty \|x_n\|$
is convergent. The lemma is clearly not true for general normed spaces: take, for
example, a sequence of functions in $C[0, 1]$ each with $\|f_n\|_2 = \frac{1}{n^2}$ with the property
that $\sum_{n=1}^\infty f_n$ is not continuous.

Proof. Consider the sequence of partial sums $s_m = \sum_{n=1}^m x_n$. Then

$$
\|s_m - s_k\| \le \sum_{n=k+1}^m \|x_n\| \to 0
$$

as $m > k \to \infty$. It follows that the sequence $(s_m)$ is Cauchy; since $X$ is complete
this sequence converges, so the series $\sum_{n=1}^\infty x_n$ converges. □

1. Completions 

Completeness is so important that in many applications we deal with non- 
complete normed spaces by completing them. This is analogous to the process of 
passing from Q to R by working with Cauchy sequences of rationals. In this section 
we simply outline what is done. In later sections we will see more details about 
what the completions look like. 

Let $X$ be a normed linear space. Let $C(X)$ denote the set of all Cauchy
sequences in $X$. An element of $C(X)$ is then a Cauchy sequence $(x_n)$. The linear
space structure of $X$ extends to $C(X)$ by defining $\alpha \cdot (x_n) + (y_n) = (\alpha x_n + y_n)$. The
norm $\|\cdot\|$ on $X$ extends to a semi-norm on $C(X)$, defined by

\[ \|(x_n)\| = \lim_{n \to \infty} \|x_n\|. \]

Finally, define an equivalence relation $\sim$ on $C(X)$ by $(x_n) \sim (y_n)$ if and only if
$x_n - y_n \to 0$. Then the linear space operations and the semi-norm are well-defined
on the space of equivalence classes $C(X)/\sim$, giving a complete normed linear space
$\bar{X}$ called the completion of $X$.

Exercise 2.1. [1] Apply the process outlined above to the rationals $\mathbb{Q}$. Try to
see why the obvious extension of the norm to the space of Cauchy sequences only
gives a semi-norm.

[2] Construct a Cauchy sequence $(f_n)$ in $(C[0,1], \|\cdot\|_2)$ with the property that
$f_n \neq 0$ for any $n$ but $\|f_n\|_2 \to 0$. This means that the Cauchy sequence $(f_n)$ and
the Cauchy sequence $(0)$ are not separated by the semi-norm $\|\cdot\|_2$, showing it is
not a norm.

[3] Show that if $X$ is already a Banach space, then there is a bijective isomorphism
between $X$ and its completion $\bar{X}$.

It should be clear from the above that it is going to be difficult to work with
elements of the completions in this formal way, where an element of $\bar{X}$ is an
equivalence class of Cauchy sequences. However all we will ever need is the simple
statement: for any normed linear space $X$, there is a Banach space $\bar{X}$ such that $X$
is isomorphic to a dense subspace $\imath(X)$ of $\bar{X}$; the map $\imath$ from $X$ into $\bar{X}$ preserves
all the linear space operations.






Example 2.2. [1] We have seen that $C[0,1]$ under the 2-norm is not complete
(cf. Example 1.10 [2]). Similar examples will show that $C[0,1]$ is not complete
under any of the $p$-norms. Let $X$ denote the non-complete space $(C[0,1], \|\cdot\|_p)$.
A reasonable guess for the completion $\bar{X}$ might be the space of Riemann-integrable functions with
finite $p$-norm, but this is still not complete. It is easy to construct a Cauchy
sequence of Riemann-integrable functions that does not converge to a Riemann-integrable
function in the $p$-norm. However, if you use Lebesgue integration, you
do get a complete space, called $L^p[0,1]$. For now, think of this space as consisting of
all Riemann-integrable functions with finite $p$-norm together with extra functions
obtained as limits of sequences of Riemann-integrable functions. Then $L^p$ provides
a further example of a Banach space.

[2] A function $f : X \to Y$ is said to have compact support if it is zero outside
some compact subset of $X$; the support of $f$ is the smallest closed set containing
$\{x \in X \mid f(x) \neq 0\}$. This example is of importance in distribution theory and
the study of partial differential equations. Let $C_0^{\infty}(\Omega)$ be the space of infinitely
differentiable functions of compact support on $\Omega$, an open subset of $\mathbb{R}^n$. Recall the
definition of higher-order derivatives $D^a$ from Example 1.1(8). For each $k \in \mathbb{N}$ and
$1 \le p < \infty$ define a norm

\[ \|f\|_{k,p} = \left( \int_{\Omega} \sum_{|a| \le k} |D^a f(x)|^p \, dx \right)^{1/p}. \]

This gives an infinite family of (different) normed space structures on the linear
space $C_0^{\infty}(\Omega)$. None of these spaces is complete because there are sequences of
$C^{\infty}$ functions whose $(k,p)$-limit is not even continuous. The completions of these
spaces are the Sobolev spaces.



2. Contraction mapping theorem

In this section we prove the simplest of the many fixed-point theorems. Such
theorems are useful for solving equations, and with the formalism of function spaces
one uniform treatment may be given for numerical equations like $x = \cos(x)$ and
differential equations like $\frac{dy}{dx} = x + \tan(xy)$, $y(0) = y_0$.

Exercise 2.2. If you have an electronic calculator, put it in "radians" mode.
Starting with any initial value, press the cos button repeatedly. What happens?
Can you explain why this happens? (Draw a graph.) How does this relate to the
equation $x = \cos(x)$?

Definition 2.2. A map $F : X \to Y$ between normed linear spaces is called a
contraction if there is a constant $K < 1$ for which

\[ \|F(x) - F(y)\|_Y \le K \cdot \|x - y\|_X \tag{7} \]

for all $x, y \in X$.

Exercise 2.3. [1] Any contraction is uniformly continuous.
[2] If $f : [a,b] \to [a,b]$ has the property that $|f(x) - f(y)| < |x - y|$ then $f$ is a
contraction.

[3] Find an example of a function $f : \mathbb{R} \to \mathbb{R}$ that has the property $|f(x) - f(y)| <
|x - y|$ for all $x, y \in \mathbb{R}$, but $f$ is not a contraction.
Theorem 2.1. If $F : X \to X$ is a contraction and $X$ is a Banach space, then
there is a unique point $x^* \in X$ which is fixed by $F$ (that is, $F(x^*) = x^*$). Moreover,
if $x_0$ is any point in $X$, then the sequence defined by $x_1 = F(x_0), x_2 = F(x_1), \dots$
converges to $x^*$.
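The iteration in Theorem 2.1 is easy to try numerically. The following is a minimal sketch only, assuming the scalar map $F = \cos$ restricted to the closed set $[0,1]$ (where $|\cos'(x)| = |\sin x| \le \sin 1 < 1$); the function name `fixed_point`, the tolerance and the starting value are our own illustrative choices and not part of the theorem.

```python
import math


def fixed_point(F, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{k+1} = F(x_k) until two successive iterates differ by less than tol."""
    x = x0
    for k in range(1, max_iter + 1):
        x_next = F(x)
        if abs(x_next - x) < tol:
            return x_next, k
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")


# cos maps [0, 1] into itself and |cos'(x)| = |sin x| <= sin(1) < 1 there,
# so cos is a contraction on the closed set [0, 1].
x_star, steps = fixed_point(math.cos, 0.5)
print(x_star, steps)
```

Stopping when successive iterates are close is a practical convenience; the theorem itself guarantees convergence from any starting point but does not prescribe a stopping rule.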

Corollary 2.1. If $S$ is a closed subset of the Banach space $X$, and $F : S \to S$
is a contraction, then $F$ has a unique fixed point in $S$.

Proof. Simply notice that S is itself complete (since it is a closed subset of 
a complete space), and the proof of Theorem 2.1 does not use the linear space 
structure of X. □ 

Corollary 2.2. If $S$ is a closed subset of a Banach space, and $F : S \to S$ has
the property that for some $n$ there is a $K < 1$ such that

\[ \|F^n(x) - F^n(y)\| \le K \cdot \|x - y\| \]

for all $x, y \in S$, then $F$ has a unique fixed point.

Proof. Choose any point $x_0 \in S$. Then by Corollary 2.1 we have

\[ x = \lim_{k \to \infty} F^{kn} x_0, \]

where $x$ is the unique fixed point of $F^n$. By the continuity of $F$,

\[ Fx = \lim_{k \to \infty} F F^{kn} x_0. \]

On the other hand, $F^n$ is a contraction, so

\[ \|F^{kn} F x_0 - F^{kn} x_0\| \le K \|F^{(k-1)n} F x_0 - F^{(k-1)n} x_0\| \le \dots \le K^k \|F(x_0) - x_0\|, \]

so

\[ \|F(x) - x\| = \lim_{k \to \infty} \|F F^{kn} x_0 - F^{kn} x_0\| = 0. \]

It follows that $F(x) = x$ so $x$ is a fixed point for $F$. This fixed point is automatically
unique: if $F$ has more than one fixed point, then so does $F^n$, which is impossible
by Corollary 2.1. □

Exercise 2.4. [1] Give an example of a map $f : \mathbb{R} \to \mathbb{R}$ which has the property
that $|f(x) - f(y)| < |x - y|$ for all $x, y \in \mathbb{R}$ but $f$ has no fixed point.

[2] Let $f$ be a function from $[0,1]$ to $[0,1]$. Check that the contraction condition
(7) holds if $f$ has a continuous derivative $f'$ on $[0,1]$ with the property that

\[ |f'(x)| \le K < 1 \]

for all $x \in [0,1]$. As an exercise, draw graphs to illustrate convergence of the iterates$^{2}$
of $f$ to a fixed point for examples with $0 < f'(x) < 1$ and $-1 < f'(x) < 0$.

2 Iteration of continuous functions on the interval may be used to illustrate many of the
features of dynamical systems, including frequency locking, sensitive dependence on initial conditions,
period doubling, the Feigenbaum phenomena and so on. An excellent starting point is the article
and demonstration "One-dimensional iteration" at the web site http://www.geom.umn.edu/java/.

Example 2.3. A basic linear problem is the following: let $F : \mathbb{R}^n \to \mathbb{R}^n$ be the
affine map defined by

\[ F(x) = Ax + b \]

where $A = (a_{ij})$ is an $n \times n$ matrix. Equivalently, $F(x) = y$, where

\[ y_i = \sum_{j=1}^{n} a_{ij} x_j + b_i \]

for $i = 1, \dots, n$. If $F$ is a contraction, then we can apply Theorem 2.1 to solve$^{3}$ the
equation $F(x) = x$. The conditions under which $F$ is a contraction depend on the
choice of norm for $\mathbb{R}^n$. Three examples follow.

3 Of course the equation is in one sense trivial. However, it is sometimes of importance
computationally to avoid inverting matrices, and more importantly to have an iterative scheme
that converges to a solution in some predictable fashion.

[1] Using the max norm, $\|x\|_{\infty} = \max_i \{|x_i|\}$. In this case,

\[ \begin{aligned}
\|F(x) - F(\tilde{x})\|_{\infty} &= \max_i \Big| \sum_j a_{ij} (x_j - \tilde{x}_j) \Big| \\
&\le \max_i \sum_j |a_{ij}| \, |x_j - \tilde{x}_j| \\
&\le \max_i \sum_j |a_{ij}| \, \max_j |x_j - \tilde{x}_j| \\
&= \Big( \max_i \sum_j |a_{ij}| \Big) \|x - \tilde{x}\|_{\infty}.
\end{aligned} \]

Thus the contraction condition is

\[ \sum_j |a_{ij}| \le K < 1 \quad \text{for } i = 1, \dots, n. \tag{8} \]

[2] Using the 1-norm, $\|x\|_1 = \sum_{i=1}^{n} |x_i|$. In this case,

\[ \begin{aligned}
\|F(x) - F(\tilde{x})\|_1 &= \sum_i \Big| \sum_j a_{ij} (x_j - \tilde{x}_j) \Big| \\
&\le \sum_i \sum_j |a_{ij}| \, |x_j - \tilde{x}_j| \\
&\le \Big( \max_j \sum_i |a_{ij}| \Big) \|x - \tilde{x}\|_1.
\end{aligned} \]

The contraction condition is now

\[ \sum_i |a_{ij}| \le K < 1 \quad \text{for } j = 1, \dots, n. \tag{9} \]

[3] Using the 2-norm, $\|x\|_2 = \big( \sum_{i=1}^{n} |x_i|^2 \big)^{1/2}$. In this case, by the Cauchy-Schwarz inequality,

\[ \|F(x) - F(\tilde{x})\|_2^2 = \sum_i \Big( \sum_j a_{ij} (x_j - \tilde{x}_j) \Big)^2 \le \Big( \sum_i \sum_j a_{ij}^2 \Big) \|x - \tilde{x}\|_2^2. \]

The contraction condition is now

\[ \sum_i \sum_j a_{ij}^2 \le K^2 < 1. \tag{10} \]

It follows that if any one of the conditions (8), (9), or (10) holds, then there
exists a unique solution in $\mathbb{R}^n$ to the affine equation $Ax + b = x$. Moreover,
the solution may be approximated using the iterative scheme $x_1 = F(x_0), x_2 =
F(x_1), \dots$.

Notice that each of the conditions (8), (9), (10) is sufficient for the method to
work, but none of them is necessary. In fact, for each of the three conditions there
are examples in which that condition holds and the other two fail.
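As a rough numerical illustration of the three conditions and of the iterative scheme, here is a short sketch. The matrix `A`, the vector `b` and all function names are our own hypothetical choices; the matrix is picked so that the row-sum condition (8) holds while (9) and (10) both fail, yet the iteration still converges.

```python
def row_sum_bound(A):
    """Left-hand side of condition (8): max over i of sum_j |a_ij|."""
    return max(sum(abs(a) for a in row) for row in A)


def column_sum_bound(A):
    """Left-hand side of condition (9): max over j of sum_i |a_ij|."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def square_sum(A):
    """Left-hand side of condition (10): sum over i, j of a_ij^2."""
    return sum(a * a for row in A for a in row)


def iterate_affine(A, b, x0, steps):
    """Apply F(x) = Ax + b to x0 the given number of times."""
    x = list(x0)
    for _ in range(steps):
        x = [sum(a * xj for a, xj in zip(row, x)) + bi for row, bi in zip(A, b)]
    return x


A = [[0.9, 0.0], [0.9, 0.0]]   # illustrative matrix, not taken from the text
b = [1.0, 1.0]
print(row_sum_bound(A), column_sum_bound(A), square_sum(A))
x = iterate_affine(A, b, [0.0, 0.0], steps=400)
print(x)   # approximates the solution of Ax + b = x, which here is (10, 10)
```

Solving $x = Ax + b$ by hand for this matrix gives $x_1 = 10$ and $x_2 = 0.9 x_1 + 1 = 10$, which is what the iteration approaches.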

It remains only to prove the contraction mapping theorem. 
PROOF of Theorem 2.1. Given any point $x_0 \in X$, define a sequence $(x_n)$ by

\[ x_1 = F(x_0), \ x_2 = F(x_1), \dots. \]

Then, for any $n \le m$ we have by the contraction condition (7)

\[ \begin{aligned}
\|x_n - x_m\| &= \|F^n x_0 - F^m x_0\| \\
&\le K^n \|x_0 - F^{m-n} x_0\| \\
&\le K^n \big( \|x_0 - x_1\| + \|x_1 - x_2\| + \dots + \|x_{m-n-1} - x_{m-n}\| \big) \\
&\le K^n \|x_0 - x_1\| \big( 1 + K + K^2 + \dots + K^{m-n-1} \big) \\
&\le \frac{K^n}{1 - K} \|x_0 - x_1\|.
\end{aligned} \]

Now for fixed $x_0$, the last expression converges to zero as $n$ goes to infinity so (cf.
Definition 1.9) the sequence $(x_n)$ is a Cauchy sequence.

Since the linear space $X$ is complete (cf. Definition 1.10), the sequence $(x_n)$
therefore converges, say

\[ x^* = \lim_{n \to \infty} x_n. \]

Since $F$ is continuous,

\[ F(x^*) = F\big( \lim_{n \to \infty} x_n \big) = \lim_{n \to \infty} F(x_n) = \lim_{n \to \infty} x_{n+1} = x^*, \]

so $F$ has a fixed point $x^*$. To prove that $x^*$ is the only fixed point for $F$, notice
that if $F(y) = y$ say, then

\[ \|x^* - y\| = \|F(x^*) - F(y)\| \le K \|x^* - y\|, \]

which requires that $x^* = y$ since $K < 1$. □

3. Applications to differential equations 

As mentioned before, the most important applications of the contraction mapping
method are to function spaces. We have seen already in Example 1.8 [3] that
fixed points for certain integral operators on function spaces are solutions of ordinary
differential equations. The first result in this direction is due to Picard$^{4}$.

4 (Charles) Émile Picard (1856-1941), who was Professor of higher analysis at the Sorbonne
and became permanent secretary of the Paris Academy of Sciences. Some of his deepest results
lie in complex analysis: 1) a non-constant entire function can omit at most one finite value, 2)
a non-polynomial entire function takes on every value (except the possible exceptional one) an
infinite number of times.
Theorem 2.2. Let $f : G \to \mathbb{R}$ be a continuous function defined on a set $G$
containing a neighbourhood $\{(x,y) \mid \|(x,y) - (x_0,y_0)\|_{\infty} < \epsilon\}$ of $(x_0,y_0)$ for some
$\epsilon > 0$. Suppose that $f$ satisfies a Lipschitz condition of the form

\[ |f(x,y) - f(x,\tilde{y})| \le M |y - \tilde{y}| \tag{11} \]

in the variable $y$ on $G$. Then there is an interval $(x_0 - \delta, x_0 + \delta)$ on which the
ordinary differential equation

\[ \frac{dy}{dx} = f(x,y) \tag{12} \]

has a unique solution $y = \phi(x)$ satisfying the initial condition

\[ \phi(x_0) = y_0. \tag{13} \]

PROOF. The differential equation (12) with initial condition (13) is equivalent
to the integral equation

\[ \phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t)) \, dt. \tag{14} \]

Since $f$ is continuous there is a bound

\[ |f(x,y)| \le R \tag{15} \]

for all $(x,y)$ with $\|(x,y) - (x_0,y_0)\|_{\infty} \le \epsilon'$ for some $\epsilon' > 0$. Choose $\delta > 0$ such that

(1) $|x - x_0| \le \delta$, $|y - y_0| \le R\delta$ together imply that $\|(x,y) - (x_0,y_0)\|_{\infty} \le \epsilon'$;

(2) $M\delta < 1$ where $M$ is the Lipschitz constant in (11).

Let $S$ be the set of continuous functions $\phi$ defined on the interval $|x - x_0| \le \delta$
with the property that $|\phi(x) - y_0| \le R\delta$, equipped with the sup metric. The set
$S$ is complete, since it is a closed subset of a complete space. Define a mapping
$F : S \to S$ by the equation

\[ (F(\phi))(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t)) \, dt. \]

First check that $F$ does indeed map $S$ into $S$: if $\phi \in S$, then

\[ \begin{aligned}
|F\phi(x) - y_0| &= \Big| \int_{x_0}^{x} f(t, \phi(t)) \, dt \Big| \\
&\le \Big| \int_{x_0}^{x} |f(t, \phi(t))| \, dt \Big| \\
&\le R |x - x_0| \le R\delta
\end{aligned} \tag{16} \]

by (15), so $F(\phi) \in S$. Moreover,

\[ |F\phi(x) - F\tilde{\phi}(x)| \le \Big| \int_{x_0}^{x} |f(t, \phi(t)) - f(t, \tilde{\phi}(t))| \, dt \Big| \le M\delta \max_x |\phi(x) - \tilde{\phi}(x)|, \]

so that

\[ \|F(\phi) - F(\tilde{\phi})\| \le M\delta \|\phi - \tilde{\phi}\|, \]

after taking sups over $x$. By construction, $M\delta < 1$, so that $F$ is a contraction
mapping. It follows from Corollary 2.2 that the operator $F$ has a unique fixed
point in $S$, so the differential equation (12) with initial condition (13) has a unique
solution. □
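The operator $F$ in the proof can be iterated numerically. The sketch below is an illustration under our own assumptions rather than part of the proof: it takes $f(x,y) = y$, $x_0 = 0$, $y_0 = 1$ (so $M = 1$ and the exact solution is $e^x$), works only on the right half $[x_0, x_0 + \delta]$ of the interval with $\delta = 0.5$, and replaces the integral by the trapezoidal rule on a uniform grid. The name `picard` and the grid parameters are hypothetical.

```python
import math


def picard(f, x0, y0, delta, n_grid=2001, n_iter=30):
    """Iterate phi -> y0 + integral from x0 to x of f(t, phi(t)) dt on [x0, x0 + delta]."""
    h = delta / (n_grid - 1)
    xs = [x0 + i * h for i in range(n_grid)]
    phi = [y0] * n_grid                      # starting guess: the constant function y0
    for _ in range(n_iter):
        g = [f(x, y) for x, y in zip(xs, phi)]
        new_phi = [y0]
        acc = 0.0
        for i in range(1, n_grid):
            acc += 0.5 * h * (g[i - 1] + g[i])   # trapezoidal rule on [x_{i-1}, x_i]
            new_phi.append(y0 + acc)
        phi = new_phi
    return xs, phi


xs, phi = picard(lambda x, y: y, 0.0, 1.0, 0.5)
err = max(abs(p - math.exp(x)) for x, p in zip(xs, phi))
print(err)   # distance in the sup norm (on the grid) from the exact solution e^x
```

Starting from the constant function, each pass through the loop adds roughly one more term of the Taylor series of $e^x$, which is what the exact iteration of $F$ does in this example.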
The conditions on the set $G$ used in Theorem 2.2 arise very often so it is
useful to have a short description for them. A domain in a normed linear space
$X$ is an open connected set (cf. Definition 1.11). An example of a domain in $\mathbb{R}$
containing the point $a$ is an interval $(a - \delta, a + \delta)$ for some $\delta > 0$. Notice that
if $G$ is a domain in $(X, \|\cdot\|_X)$, and $a \in G$ then for some $\epsilon > 0$ the open ball
$B_{\epsilon}(a) = \{x \in X \mid \|x - a\|_X < \epsilon\}$ lies in $G$ (cf. Definition 1.13).

Picard's theorem easily generalises to systems of simultaneous differential equa- 
tions, and we shall see in the next section that the contraction mapping method 
also applies to certain integral equations. 

Theorem 2.3. Let $G$ be a domain in $\mathbb{R}^{n+1}$ containing the point $(x_0, y_{01}, \dots, y_{0n})$,
and let $f_1, \dots, f_n$ be continuous functions from $G$ to $\mathbb{R}$ each satisfying a Lipschitz
condition

\[ |f_i(x, y_1, \dots, y_n) - f_i(x, \tilde{y}_1, \dots, \tilde{y}_n)| \le M \max_{1 \le j \le n} |y_j - \tilde{y}_j| \tag{17} \]

in the variables $y_1, \dots, y_n$. Then there is an interval $(x_0 - \delta, x_0 + \delta)$ on which the
system of simultaneous ordinary differential equations

\[ \frac{dy_i}{dx} = f_i(x, y_1, \dots, y_n) \quad \text{for } i = 1, \dots, n \tag{18} \]

has a unique solution

\[ y_1 = \phi_1(x), \dots, y_n = \phi_n(x) \]

satisfying the initial conditions

\[ \phi_1(x_0) = y_{01}, \dots, \phi_n(x_0) = y_{0n}. \tag{19} \]

PROOF. As in the proof of Theorem 2.2, write the system defined by (18) and
(19) in integral form

\[ \phi_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \phi_1(t), \dots, \phi_n(t)) \, dt \quad \text{for } i = 1, \dots, n. \tag{20} \]

Since each of the functions $f_i$ is continuous on $G$, there is a bound

\[ |f_i(x, y_1, \dots, y_n)| \le R \tag{21} \]

in some domain $G' \subset G$ with $G' \ni (x_0, y_{01}, \dots, y_{0n})$. Choose $\delta > 0$ with the
properties that

(1) $|x - x_0| \le \delta$ and $\max_i |y_i - y_{0i}| \le R\delta$ together imply that $(x, y_1, \dots, y_n) \in G'$;

(2) $M\delta < 1$.

Now define the set $S$ to be the set of $n$-tuples $(\phi_1, \dots, \phi_n)$ of continuous
functions defined on the interval $[x_0 - \delta, x_0 + \delta]$ and such that $|\phi_i(x) - y_{0i}| \le R\delta$ for all
$i = 1, \dots, n$. The set $S$ may be equipped with the norm

\[ \|\phi - \tilde{\phi}\| = \max_{x, i} |\phi_i(x) - \tilde{\phi}_i(x)|. \]

It is easy to check that $S$ is complete. The mapping $F$ defined by the set of integral
operators

\[ (F(\phi))_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \phi_1(t), \dots, \phi_n(t)) \, dt \quad \text{for } |x - x_0| \le \delta, \ i = 1, \dots, n \]

is a contraction from $S$ to itself. To see this, first notice that if
$\phi = (\phi_1, \dots, \phi_n) \in S$, and $|x - x_0| \le \delta$
then

\[ |(F(\phi))_i(x) - y_{0i}| = \Big| \int_{x_0}^{x} f_i(t, \phi_1(t), \dots, \phi_n(t)) \, dt \Big| \le R\delta \quad \text{for } i = 1, \dots, n \]

by (21), so that $F(\phi) = (F(\phi)_1, \dots, F(\phi)_n)$ lies in $S$. It remains to check that $F$ is
a contraction:

\[ \begin{aligned}
|(F(\phi))_i(x) - (F(\tilde{\phi}))_i(x)| &\le \Big| \int_{x_0}^{x} |f_i(t, \phi_1(t), \dots, \phi_n(t)) - f_i(t, \tilde{\phi}_1(t), \dots, \tilde{\phi}_n(t))| \, dt \Big| \\
&\le M\delta \max_{x, i} |\phi_i(x) - \tilde{\phi}_i(x)|;
\end{aligned} \]

after maximising over $x$ and $i$ we have

\[ \|F(\phi) - F(\tilde{\phi})\| \le M\delta \|\phi - \tilde{\phi}\|, \]

so $F : S \to S$ is a contraction. It follows that the equation (20) has a unique
solution, so the system of differential equations (18) and (19) has a unique solution.
□



4. Applications to integral equations 

Integral equations may be a little less familiar than differential equations (though
we have seen already that the two are intimately connected), so we begin with some
important examples. The theory of integral equations is largely modern (twentieth
century) mathematics, but several specific instances of integral equations had
appeared earlier.

Certain problems in physics led to the need to "invert" the integral equation

\[ g(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ixy} f(y) \, dy \tag{22} \]

for functions $f$ and $g$ of specific kinds. This was solved - formally at least - by
Fourier$^{5}$ in 1811, who noted that (22) requires that

\[ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ixy} g(y) \, dy. \]

We shall see later that this is really due to properties of particularly good Banach
spaces called Hilbert spaces.

5 Jean Baptiste Joseph Fourier (1768-1830), who pursued interests in mathematics and
mathematical physics. He became famous for his Théorie analytique de la chaleur (1822), a
mathematical treatment of the theory of heat. He established the partial differential equation governing
heat diffusion and solved it by using infinite series of trigonometric functions. Though these series
had been used before, Fourier investigated them in much greater detail. His research, initially
criticized for its lack of rigour, was later shown to be valid. It provided the impetus for later work
on trigonometric series and the theory of functions of a real variable.
Abel$^{6}$ studied generalizations of the tautochrone$^{7}$ problem, and was led to the
integral equation

\[ g(x) = \int_a^x \frac{f(y)}{(x - y)^b} \, dy, \quad b \in (0,1), \ g(a) = 0, \]

for which he found the solution

\[ f(y) = \frac{\sin \pi b}{\pi} \int_a^y \frac{g'(x)}{(y - x)^{1-b}} \, dx. \]

This equation is an example of a Volterra$^{8}$ equation.

We shall briefly study two kinds of integral equation (though the second is
formally a special case of the first).

6 Niels Henrik Abel (1802-1829) was a brilliant Norwegian mathematician. He earned wide
recognition at the age of 18 with his first paper, in which he proved that the general polynomial
equation of the fifth degree is insolvable by algebraic procedures (problems of this sort are studied
in Galois Theory). Abel was instrumental in establishing mathematical analysis on a rigorous
basis. In his major work, Recherches sur les fonctions elliptiques (Investigations on Elliptic
Functions, 1827), he revolutionized the understanding of elliptic functions by studying the inverse of
these functions.

7 Also called an isochrone: a curve along which a pendulum takes the same time to make a
complete oscillation independent of the amplitude of the oscillation. The resulting differential
equation was solved by James Bernoulli in May 1690, who showed that the result is a cycloid.

8 Vito Volterra (1860-1940) succeeded Beltrami as professor of Mathematical Physics at
Rome. His method for solving the equations that carry his name is exactly the one we shall
use. He worked widely in analysis and integral equations, and helped drive Lebesgue to produce a
more sophisticated integration by giving an example of a function with bounded derivative whose
derivative is not Riemann integrable.

9 This is really a Fredholm equation "of the second kind", named after the Swedish geometer
Erik Ivar Fredholm (1866-1927).

Example 2.4. A Fredholm equation$^{9}$ is an integral equation of the form

\[ f(x) = \lambda \int_a^b K(x,y) f(y) \, dy + \phi(x), \tag{23} \]

where $K$ and $\phi$ are two given functions, and we seek a solution $f$ in terms of
the arbitrary (constant) parameter $\lambda$. The function $K$ is called the kernel of the
equation, and the equation is called homogeneous if $\phi = 0$.

We assume that $K(x,y)$ and $\phi(x)$ are continuous on the square $\{(x,y) \mid a \le
x \le b, a \le y \le b\}$. It follows in particular (see Section 1.9) that there is a bound
$M$ so that

\[ |K(x,y)| \le M \quad \text{for all } a \le x \le b, \ a \le y \le b. \]

Define a mapping $F : C[a,b] \to C[a,b]$ by

\[ (F(f))(x) = \lambda \int_a^b K(x,y) f(y) \, dy + \phi(x). \tag{24} \]

Now

\[ \begin{aligned}
\|F(f_1) - F(f_2)\| &= \max_x |F(f_1)(x) - F(f_2)(x)| \\
&\le |\lambda| M (b - a) \max_x |f_1(x) - f_2(x)| \\
&= |\lambda| M (b - a) \|f_1 - f_2\|,
\end{aligned} \]

so that $F$ is a contraction mapping if

\[ |\lambda| < \frac{1}{M(b - a)}. \]

It follows by Theorem 2.1 that the equation (23) has a unique continuous solution
$f$ for small enough values of $\lambda$, and the solution may be obtained by starting with
any continuous function $f_0$ and then iterating the scheme

\[ f_{n+1}(x) = \lambda \int_a^b K(x,y) f_n(y) \, dy + \phi(x). \]
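The iteration scheme for the Fredholm equation can be tried on a grid. The following sketch rests on our own assumptions: the kernel $K(x,y) = xy$, $\phi(x) = x$, $\lambda = 0.5$ on $[0,1]$ (so $M = 1$ and $|\lambda| < 1/(M(b-a))$), with the integral replaced by the trapezoidal rule. For this choice the exact solution can be found by hand: writing $c = \int_0^1 y f(y)\,dy$ gives $f(x) = x(1 + c/2)$ and $c = 0.4$, so $f(x) = 1.2x$. The name `fredholm_iterate` is hypothetical.

```python
def fredholm_iterate(K, phi, lam, a, b, n_grid=201, n_iter=40):
    """Iterate f -> lam * integral_a^b K(x, y) f(y) dy + phi(x) on a uniform grid."""
    h = (b - a) / (n_grid - 1)
    xs = [a + i * h for i in range(n_grid)]
    w = [h] * n_grid                 # trapezoidal weights
    w[0] = w[-1] = h / 2
    f = [0.0] * n_grid               # starting function f_0 = 0
    for _ in range(n_iter):
        f = [lam * sum(wj * K(x, y) * fj for wj, y, fj in zip(w, xs, f)) + phi(x)
             for x in xs]
    return xs, f


xs, f = fredholm_iterate(lambda x, y: x * y, lambda x: x, 0.5, 0.0, 1.0)
err = max(abs(fi - 1.2 * x) for x, fi in zip(xs, f))
print(err)   # grid distance from the hand-computed solution f(x) = 1.2 x
```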

Example 2.5. Now consider the Volterra equation,

\[ f(x) = \lambda \int_a^x K(x,y) f(y) \, dy + \phi(x), \tag{25} \]

which only differs$^{10}$ from the Fredholm equation (23) in that the variable $x$ appears
as the upper limit of integration. As before, define a function $F : C[a,b] \to C[a,b]$
by

\[ (F(f))(x) = \lambda \int_a^x K(x,y) f(y) \, dy + \phi(x). \]

10 If we extend the definition of the kernel $K(x,y)$ appearing in (25) by setting $K(x,y) = 0$
for all $y > x$ then (25) becomes an instance of the Fredholm equation (23). This is not done
because the contraction mapping method applied to the Volterra equation directly gives a better
result in that the condition on $\lambda$ can be avoided.

Then for $f_1, f_2 \in C[a,b]$ we have

\[ \begin{aligned}
|F(f_1)(x) - F(f_2)(x)| &= \Big| \lambda \int_a^x K(x,y) [f_1(y) - f_2(y)] \, dy \Big| \\
&\le |\lambda| M (x - a) \max_x |f_1(x) - f_2(x)|,
\end{aligned} \]

where $M = \max_{x,y} |K(x,y)| < \infty$. It follows that

\[ \begin{aligned}
|F^2(f_1)(x) - F^2(f_2)(x)| &= \Big| \lambda \int_a^x K(x,y) [F(f_1)(y) - F(f_2)(y)] \, dy \Big| \\
&\le |\lambda| M \int_a^x |F(f_1)(y) - F(f_2)(y)| \, dy \\
&\le |\lambda|^2 M^2 \max_x |f_1(x) - f_2(x)| \int_a^x |y - a| \, dy \\
&= |\lambda|^2 M^2 \frac{(x - a)^2}{2} \max_x |f_1(x) - f_2(x)|,
\end{aligned} \]

and in general

\[ \begin{aligned}
|F^n(f_1)(x) - F^n(f_2)(x)| &\le |\lambda|^n M^n \frac{(x - a)^n}{n!} \max_x |f_1(x) - f_2(x)| \\
&\le |\lambda|^n M^n \frac{(b - a)^n}{n!} \max_x |f_1(x) - f_2(x)|.
\end{aligned} \]

It follows that

\[ \|F^n f_1 - F^n f_2\| \le |\lambda|^n M^n \frac{(b - a)^n}{n!} \|f_1 - f_2\|, \]

so that $F^n$ is a contraction mapping if $n$ is chosen large enough to ensure that

\[ |\lambda|^n M^n \frac{(b - a)^n}{n!} < 1. \]

It follows by Corollary 2.2 that the equation (25) has a unique solution for all $\lambda$.
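To see the factorial bound at work, here is a small sketch under our own assumptions: $K \equiv 1$, $\phi \equiv 1$ and $\lambda = 3$ on $[0,1]$, so that $|\lambda| M (b-a) = 3 > 1$ and $F$ itself is not covered by the Fredholm estimate, while the exact solution of (25) is $f(x) = e^{3x}$. The integral is replaced by the trapezoidal rule; the names `volterra_iterate` and `first_contracting_power` are hypothetical.

```python
import math


def first_contracting_power(lam, M, length):
    """Smallest n with |lam|^n M^n length^n / n! < 1."""
    n = 1
    while (abs(lam) * M * length) ** n / math.factorial(n) >= 1:
        n += 1
    return n


def volterra_iterate(K, phi, lam, a, b, n_grid=401, n_iter=40):
    """Iterate f -> lam * integral_a^x K(x, y) f(y) dy + phi(x) on a uniform grid."""
    h = (b - a) / (n_grid - 1)
    xs = [a + i * h for i in range(n_grid)]
    f = [phi(x) for x in xs]
    for _ in range(n_iter):
        new_f = []
        for i, x in enumerate(xs):
            if i == 0:
                integral = 0.0
            else:
                vals = [K(x, xs[j]) * f[j] for j in range(i + 1)]
                # trapezoidal rule on [a, x_i]
                integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
            new_f.append(lam * integral + phi(x))
        f = new_f
    return xs, f


n_star = first_contracting_power(3.0, 1.0, 1.0)
xs, f = volterra_iterate(lambda x, y: 1.0, lambda x: 1.0, 3.0, 0.0, 1.0)
err = max(abs(fi - math.exp(3.0 * x)) for x, fi in zip(xs, f))
print(n_star, err)
```

The remaining error comes from the quadrature rule rather than from the iteration, which in this example reproduces the partial sums of the exponential series.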



CHAPTER 3 



Linear Transformations 



Let $X$ and $Y$ be linear spaces, and $T$ a function from the set $D_T \subset X$ into $Y$.
Sometimes such functions will be called operators, mappings or transformations.
The set $D_T$ is the domain of $T$, and $T(D_T) \subset Y$ is the range of $T$. If the set $D_T$ is
a linear subspace of $X$ and $T$ is a linear map, that is

\[ T(\alpha x + \beta y) = \alpha T(x) + \beta T(y) \quad \text{for all } \alpha, \beta \in \mathbb{R} \text{ or } \mathbb{C}, \ x, y \in D_T, \tag{26} \]

then $T$ is called a linear operator. Notice that a linear operator is injective if and
only if the kernel $\{x \in X \mid Tx = 0\}$ is trivial.

Lemma 3.1. A linear transformation $T : X \to Y$ is continuous if and only if
it is continuous at one point.

Proof. Assume that $T$ is continuous at a point $a$. Then for any sequence
$x_n \to a$, $T(x_n) \to T(a)$. Let $z$ be any point in $X$, and $(y_n)$ a sequence with $y_n \to z$.
Then $y_n - z + a$ is a sequence converging to $a$, so $T(y_n - z + a) = T(y_n) - T(z) +
T(a) \to T(a)$. It follows that $T(y_n) \to T(z)$. □

A simple observation that is useful in differential equations, where it is called
the principle of superposition: if $\sum_{n=1}^{\infty} \alpha_n x_n$ is convergent, and $T$ is a continuous
linear map, then $T\big( \sum_{n=1}^{\infty} \alpha_n x_n \big) = \sum_{n=1}^{\infty} \alpha_n T x_n$.

1. Bounded operators 

Example 3.1. Consider a voltage $v = v(t)$ applied to a resistor $R$, capacitor $C$,
and inductor $L$ arranged in series (an "LCR" circuit). The charge $u = u(t)$ on the
capacitor satisfies the equation

\[ L \frac{d^2 u}{dt^2} + R \frac{du}{dt} + \frac{1}{C} u = v \tag{27} \]

with some initial conditions say $u(0) = 0$, $\frac{du}{dt}(0) = 0$. Assuming that $R^2 > 4L/C$,
then the solution of (27) is

\[ u(t) = \int_0^t k(t - s) v(s) \, ds, \tag{28} \]

where

\[ k(t) = \frac{e^{\lambda_1 t} - e^{\lambda_2 t}}{L(\lambda_1 - \lambda_2)} \]

and $\lambda_1, \lambda_2$ are the (distinct) roots of $L\lambda^2 + R\lambda + \frac{1}{C} = 0$.

This problem may be phrased in terms of linear operators. Let $X = C[0, \infty)$;
then the transformation defined by $T(v) = u$ in (28) is a linear operator from $X$ to
$X$.

Similarly, (27) can be written in the form $S(u) = v$ for some linear operator
$S$. However, $S$ cannot be defined on all of $X$ - only on the dense linear subspace
of twice-differentiable functions. The transformations $T$ and $S$ are closely related,
and we would like to develop a framework for viewing them as inverse to each other.

Definition 3.1. A linear transformation $T : X \to Y$ is (algebraically) invertible
if there is a linear transformation $S : Y \to X$ with the property that $TS = 1_Y$
and $ST = 1_X$.

For example, in Example 3.1, if we take $X = C[0, \infty)$ and $Y = C^2[0, \infty)$, then
$T$ is algebraically invertible with $T^{-1} = S$.

Definition 3.2. A linear operator $T : X \to Y$ is bounded if there is a constant
$K$ such that

\[ \|Tx\|_Y \le K \|x\|_X \quad \text{for all } x \in X. \]

The norm of the bounded linear operator $T$ is

\[ \|T\| = \sup_{x \neq 0} \left\{ \frac{\|Tx\|_Y}{\|x\|_X} \right\}. \tag{29} \]

Example 3.2. In Example 3.1, the operator $T$ is bounded when restricted to
any $C[0,a]$ for any $a$, since

\[ |Tv(t)| \le \int_0^t |k(t - s)| \cdot |v(s)| \, ds, \]

which shows that

\[ \|Tv\|_{\infty} \le a \sup_{0 \le t \le a} |k(t)| \, \|v\|_{\infty} \le \frac{a \|v\|_{\infty}}{L |\lambda_1 - \lambda_2|}. \]

The operator $S$ is not bounded of course - think about what differentiation does.

Exercise 3.1. [1] Show that $\|T\| = \sup_{\|x\| = 1} \{\|Tx\|_Y\}$.
[2] Prove the following useful inequality:

\[ \|Tx\|_Y \le \|T\| \cdot \|x\|_X \quad \text{for all } x \in X. \tag{30} \]
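In finite dimensions the operator norm (29) can be computed directly for some norms. The sketch below is only an illustration under our own assumptions: it takes $X = Y = \mathbb{R}^n$ with the max norm, where $x \mapsto \|Ax\|_{\infty}$ is convex and so attains its supremum over the unit ball at a vector whose entries are all $\pm 1$. The matrix and the function names are hypothetical.

```python
from itertools import product


def apply(A, x):
    """Matrix-vector product Ax for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def max_norm(v):
    return max(abs(t) for t in v)


def operator_norm_max(A):
    """sup of ||Ax||_inf over ||x||_inf = 1, attained at a vector of signs."""
    n = len(A[0])
    return max(max_norm(apply(A, s)) for s in product((-1.0, 1.0), repeat=n))


A = [[1.0, -2.0], [3.0, 0.5]]   # illustrative matrix, not taken from the text
norm_A = operator_norm_max(A)
print(norm_A)                     # equals the largest absolute row sum of A

x = [0.3, -0.7]
print(max_norm(apply(A, x)) <= norm_A * max_norm(x))   # an instance of inequality (30)
```

Enumerating all sign vectors costs $2^n$ evaluations, so this is only sensible for small $n$; it is meant to make the definition concrete, not to be an efficient method.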

Theorem 3.1. A linear transformation $T : X \to Y$ is continuous if and only
if it is bounded.

Proof. If $T$ is bounded and $x_n \to 0$, then by Definition 3.2, $Tx_n \to 0$ also. It
follows that $T$ is continuous at 0, so by Lemma 3.1 $T$ is continuous everywhere.

Conversely, suppose that $T$ is continuous but unbounded. Then for any $n \in \mathbb{N}$
there is a point $x_n$ with $\|Tx_n\| > n \|x_n\|$. Let $y_n = \frac{x_n}{n \|x_n\|}$, so that $y_n \to 0$ as $n \to \infty$.
On the other hand, $\|Ty_n\| > 1$ and $T(0) = 0$, contradicting the assumption that $T$
is continuous at 0. □

2. The space of linear operators 

The set of all linear transformations $X \to Y$ is itself a linear space with the
operations

\[ (T + S)(x) = Tx + Sx, \quad (\lambda T)(x) = \lambda Tx. \]

Denote this linear space by $C(X,Y)$. If $X$ and $Y$ are normed spaces, denote by
$B(X,Y)$ the subspace of continuous linear transformations. If $X = Y$, then write
$C(X,X) = C(X)$ and $B(X,X) = B(X)$.
Lemma 3.2. Let $X$ and $Y$ be normed spaces. Then $B(X,Y)$ is a normed linear
space with the norm (29). If in addition $Y$ is a Banach space, then $B(X,Y)$ is a
Banach space.

Proof. We have to show that the function $T \mapsto \|T\|$ satisfies the conditions
of Definition 1.4.

(1) It is clear that $\|T\| \ge 0$ since it is defined as the supremum of a set of
non-negative numbers. If $\|T\| = 0$ then $\|Tx\|_Y = 0$ for all $x$, so $Tx = 0$ for all $x$ - that
is, $T = 0$.

(2) The triangle inequality is also clear:

\[ \|T + S\| = \sup_{\|x\| = 1} \|(T + S)x\| \le \sup_{\|x\| = 1} \|Tx\| + \sup_{\|x\| = 1} \|Sx\| = \|T\| + \|S\|. \]

(3) $\|\lambda T\| = \sup_{\|x\| = 1} \|(\lambda T)x\| = |\lambda| \sup_{\|x\| = 1} \|Tx\| = |\lambda| \|T\|$.

Finally, assume that $Y$ is a Banach space and let $(T_n)$ be a Cauchy sequence in
$B(X,Y)$. Then the sequence is bounded: there is a constant $K$ with $\|T_n x\| \le K \|x\|$
for all $x \in X$ and $n \ge 1$. Since $\|T_n x - T_m x\| \le \|T_n - T_m\| \|x\| \to 0$ as $n > m \to \infty$,
the sequence $(T_n x)$ is a Cauchy sequence in $Y$ for each $x \in X$. Since $Y$ is complete,
for each $x \in X$ the sequence $(T_n x)$ converges; define $T$ by

\[ Tx = \lim_{n \to \infty} T_n x. \]

It is clear that $T$ is linear, and $\|Tx\| \le K \|x\|$ for all $x$, so $T \in B(X,Y)$.

We have not yet established that $T_n \to T$ in the norm of $B(X,Y)$ (cf. (29)).
Since $(T_n)$ is Cauchy, for any $\epsilon > 0$ there is an $N$ such that

\[ \|T_m - T_n\| < \epsilon \quad \text{for all } m > n > N. \]

For any $x \in X$ we therefore have

\[ \|T_m x - T_n x\|_Y \le \epsilon \|x\|_X. \]

Take the limit as $m \to \infty$ to see that

\[ \|Tx - T_n x\| \le \epsilon \|x\|, \]

so that $\|T - T_n\| \le \epsilon$ if $n > N$. This proves that $\|T - T_n\| \to 0$ as $n \to \infty$. □

Example 3.3. Once the space of linear operators is known to be complete, we
can do analysis on the operators themselves. For example, if $X$ is a Banach space
and $A \in B(X)$, then we may define an operator

\[ e^A = I + A + \frac{1}{2!} A^2 + \frac{1}{3!} A^3 + \dots, \]

which makes sense since

\[ \|e^A\| \le 1 + \|A\| + \frac{1}{2!} \|A\|^2 + \dots \le e^{\|A\|}. \]

This is particularly useful in linear systems theory and control theory; if $x(t) \in \mathbb{R}^n$
then the linear differential equation $\frac{dx}{dt} = Ax(t)$, $x(0) = x_0$, where $A$ is an $n \times n$
matrix, has as solution $x(t) = e^{At} x_0$.
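For matrices the series defining $e^A$ can be summed directly. The sketch below is a plain illustration with our own choices: it truncates the series after a fixed number of terms (adequate for small $\|A\|$, not a robust general-purpose method) and checks it on the matrix $A = \begin{pmatrix} 0 & t \\ -t & 0 \end{pmatrix}$, for which $e^A$ is the rotation matrix with entries $\cos t$ and $\pm \sin t$. The names `mat_mul` and `mat_exp` are hypothetical.

```python
import math


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def mat_exp(A, terms=30):
    """Partial sum I + A + A^2/2! + ... + A^(terms-1)/(terms-1)! of the series for e^A."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]
        result = [[r + s for r, s in zip(r_row, s_row)]
                  for r_row, s_row in zip(result, term)]
    return result


t = 1.0
A = [[0.0, t], [-t, 0.0]]      # generator of a rotation; an illustrative choice
E = mat_exp(A)
print(E)   # close to [[cos t, sin t], [-sin t, cos t]]
```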
3. Banach algebras

In many situations it makes sense to multiply elements of a normed linear space
together.

Definition 3.3. Let $X$ be a Banach space, and assume there is a multiplication
$(x,y) \mapsto xy$ from $X \times X \to X$ such that addition and multiplication make $X$ into
a ring, and

\[ \|xy\| \le \|x\| \|y\|. \]

Then $X$ is called a Banach algebra.

Recall that a ring does not need to have a unit; if $X$ has a unit then it is called
unital.

Example 3.4. [1] The continuous functions $C[0,1]$ with sup norm form a Banach
algebra with $(fg)(x) = f(x)g(x)$.

[2] If $X$ is any Banach space, then $B(X)$ is a Banach algebra:

\[ \|ST\| = \sup_{\|x\| = 1} \|(ST)x\| = \sup_{\|x\| = 1} \|S(Tx)\| \le \|S\| \sup_{\|x\| = 1} \|Tx\| = \|S\| \|T\|. \]

The algebra has an identity, namely $I(x) = x$.

[3] A special case of [2] is the case $X = \mathbb{R}^n$. By choosing a basis for $\mathbb{R}^n$ we may
identify $B(\mathbb{R}^n)$ with the space of $n \times n$ real matrices.

In the next few sections we will prove the more technical results about linear 
transformations that provide the basic tools of functional analysis. 

4. Uniform boundedness 

The first theorem is the principle of uniform boundedness or the Banach- 
Steinhaus theorem. 

Theorem 3.2. Let $X$ be a Banach space and let $Y$ be a normed linear space.
Let $\{T_{\alpha}\}$ be a family of bounded linear operators from $X$ into $Y$. If for each $x \in X$,
the set $\{T_{\alpha} x\}$ is a bounded subset of $Y$, then the set $\{\|T_{\alpha}\|\}$ is bounded.

Proof. Assume first that there is a ball $B_{\epsilon}(x_0)$ on which $\{T_{\alpha} x\}$ (a set of
functions) is uniformly bounded: that is, there is a constant $K$ such that

\[ \|T_{\alpha} x\| \le K \quad \text{if } \|x - x_0\| < \epsilon. \tag{31} \]

Then it is possible to find a uniform bound on the whole family $\{\|T_{\alpha}\|\}$. For any
$y \neq 0$ define

\[ z = \frac{\epsilon}{2 \|y\|} y + x_0. \]

Then $z \in B_{\epsilon}(x_0)$ by construction, so (31) implies that $\|T_{\alpha} z\| \le K$.
Now by linearity of $T_{\alpha}$ this shows that

\[ \frac{\epsilon}{2 \|y\|} \|T_{\alpha} y\| - \|T_{\alpha} x_0\| \le \Big\| \frac{\epsilon}{2 \|y\|} T_{\alpha} y + T_{\alpha} x_0 \Big\| = \|T_{\alpha} z\| \le K, \]

which can be solved for $\|T_{\alpha} y\|$ to give

\[ \|T_{\alpha} y\| \le \frac{2(K + K')}{\epsilon} \|y\|, \]

where $K' = \sup_{\alpha} \|T_{\alpha} x_0\| < \infty$. It follows that

\[ \|T_{\alpha}\| \le \frac{2(K + K')}{\epsilon} \]

as required.

To finish the proof we have to show that there is a ball on which property (31)
holds. This is proved by a contradiction argument: assume that there is no
ball on which (31) holds. Fix an arbitrary ball $B_0$. By assumption there is a point
$x_1 \in B_0$ such that

\[ \|T_{\alpha_1} x_1\| > 1 \]

for some index $\alpha_1$ say. Since each $T_{\alpha}$ is continuous, there is a ball $B_{\epsilon_1}(x_1)$ in which
$\|T_{\alpha_1} x\| > 1$. Assume without loss of generality that $\epsilon_1 < 1$. By assumption, in
this new ball the family $\{T_{\alpha} x\}$ is not bounded, so there is a point $x_2 \in B_{\epsilon_1}(x_1)$
with

\[ \|T_{\alpha_2} x_2\| > 2 \]

for some index $\alpha_2 \neq \alpha_1$. Continue in the same way: by continuity of $T_{\alpha_2}$ there is
a ball $B_{\epsilon_2}(x_2) \subset B_{\epsilon_1}(x_1)$ on which $\|T_{\alpha_2} x\| > 2$. Assume without loss of generality
that $\epsilon_2 < \frac{1}{2}$.

Repeating this process produces points $x_3, x_4, x_5, \dots$, indices $\alpha_3, \alpha_4, \alpha_5, \dots$,
and positive numbers $\epsilon_3, \epsilon_4, \epsilon_5, \dots$ such that $B_{\epsilon_n}(x_n) \subset B_{\epsilon_{n-1}}(x_{n-1})$, $\epsilon_n < \frac{1}{n}$, all
the $\alpha_j$'s are distinct, and

\[ \|T_{\alpha_n} x\| > n \quad \text{for all } x \in B_{\epsilon_n}(x_n). \]

Now the sequence $(x_n)$ is clearly Cauchy and therefore converges to $z \in X$ say
(equivalently, prove that $\bigcap_{n=1}^{\infty} \bar{B}_{\epsilon_n}(x_n)$ contains the single point $z$).

By construction, $\|T_{\alpha_n} z\| \ge n$ for all $n \ge 1$, which contradicts the hypothesis
that the set $\{T_{\alpha} z\}$ is bounded. □

Recall the operator norm in Definition 3.2. Corresponding to this norm there
is a notion of convergence in $B(X,Y)$: we say that a sequence $(T_n)$ is uniformly
convergent if there is $T \in B(X,Y)$ with $\|T_n - T\| \to 0$ as $n \to \infty$ (so uniform
convergence of a sequence of operators is simply convergence in the operator norm).

Definition 3.4. A sequence $(T_n)$ in $B(X,Y)$ is strongly convergent if, for any
$x \in X$, the sequence $(T_n x)$ converges in $Y$. If there is a $T \in B(X,Y)$ with
$\lim_{n \to \infty} T_n x = Tx$ for all $x \in X$, then $(T_n)$ is strongly convergent to $T$.

Exercise 3.2. [1] Prove that uniform convergence implies strong convergence.
[2] Show by example that strong convergence does not imply uniform convergence.

Theorem 3.3. Let $X$ be a Banach space, and $Y$ any normed linear space. If
a sequence $(T_n)$ in $B(X,Y)$ is strongly convergent, then there exists $T \in B(X,Y)$
such that $(T_n)$ is strongly convergent to $T$.

Proof. For each $x \in X$ the sequence $(T_n x)$ is bounded since it is convergent.
By the uniform boundedness principle (Theorem 3.2), there is a constant $K$ such
that $\|T_n\| \le K$ for all $n$. Hence

\[ \|T_n x\| \le K \|x\| \quad \text{for all } x \in X. \tag{32} \]

Define $T$ by requiring that $Tx = \lim_{n \to \infty} T_n x$ for all $x \in X$. It is clear that $T$ is
linear, and (32) shows that $\|Tx\| \le K \|x\|$ for all $x \in X$, showing that $T$ is bounded.
The construction of $T$ means that $(T_n)$ converges strongly to $T$. □
5. An application of uniform boundedness to Fourier series

This section is an application of Theorem 3.2 to Fourier analysis. We will
encounter Fourier analysis again, in the context of Hilbert spaces and $L^2$ functions.
For now we take a naive view of Fourier analysis: the functions will all be continuous
periodic functions, and we compute Fourier coefficients using Riemann integration.



Lemma 3.3.

\[ \int_0^{2\pi} \left| \frac{\sin(n+\frac{1}{2})x}{\sin\frac{1}{2}x} \right| dx \to \infty \quad \text{as } n \to \infty. \]



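Proof. Recall that $|\sin(x)| \le |x|$ for all $x$. It follows that

\[ \int_0^{2\pi} \left| \frac{\sin(n+\frac{1}{2})x}{\sin\frac{1}{2}x} \right| dx \ge \int_0^{2\pi} \frac{2}{x}\left|\sin(n+\tfrac{1}{2})x\right| dx. \]

Now $|\sin(n+\frac{1}{2})x| \ge \frac{1}{2}$ for all $x$ with $(n+\frac{1}{2})x$ between $k\pi + \frac{1}{6}\pi$ and $k\pi + \frac{5}{6}\pi$ for
$k = 1, 2, \dots$. Each such interval of $x$ has length $\frac{2\pi}{3(n+\frac{1}{2})}$, lies in $[0, 2\pi]$ when $k \le 2n$,
and on it $\frac{2}{x} \ge \frac{2(n+\frac{1}{2})}{(k+1)\pi}$. It follows (by thinking of the Riemann approximation to the integral)
that

\[ \int_0^{2\pi} \frac{2}{x}\left|\sin(n+\tfrac{1}{2})x\right| dx \ge \sum_{k=1}^{2n} \frac{1}{2}\cdot\frac{2(n+\frac{1}{2})}{(k+1)\pi}\cdot\frac{2\pi}{3(n+\frac{1}{2})} = \frac{2}{3}\sum_{k=1}^{2n}\frac{1}{k+1} \to \infty \]

as $n \to \infty$. □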
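The divergence in Lemma 3.3 is slow (logarithmic), and it can be reassuring to see it numerically. The following is only an illustrative sketch: the function names and the midpoint-rule discretisation are our own choices, not part of the notes, and the numbers it produces are approximations to the integrals rather than exact values.

```python
import math

def dirichlet_kernel(n, x):
    """sin((n + 1/2) x) / sin(x / 2), with the limiting value 2n + 1 where sin(x / 2) vanishes."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:
        return 2.0 * n + 1.0
    return math.sin((n + 0.5) * x) / s

def l1_norm_of_kernel(n, points_per_lobe=200):
    """Midpoint-rule approximation to the integral of |D_n| over [0, 2*pi]."""
    m = points_per_lobe * (2 * n + 1)          # hypothetical resolution choice
    h = 2.0 * math.pi / m
    return h * sum(abs(dirichlet_kernel(n, (k + 0.5) * h)) for k in range(m))

# Approximate integrals for a few values of n; they should grow with n.
norms = [l1_norm_of_kernel(n) for n in (1, 4, 16, 64)]
for n, value in zip((1, 4, 16, 64), norms):
    print(n, round(value, 4))
```

The printed values increase with $n$, in line with the lemma, but only very gradually.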



Definition 3.5. If $f : (0, 2\pi) \to \mathbb{R}$ is Riemann-integrable, then the Fourier
series of $f$ is the series

\[ s(x) = \sum_{m=-\infty}^{\infty} a_m e^{imx}, \qquad a_m = \frac{1}{2\pi}\int_0^{2\pi} f(t)e^{-imt}\,dt. \]

Extend the definition of $f$ to make it $2\pi$-periodic, so $f(x + 2\pi) = f(x)$ for all
$x$. Define the $n$th partial sum of the Fourier series to be

\[ s_n(x) = \sum_{m=-n}^{n} a_m e^{imx}. \]

The basic questions of Fourier analysis are then the following: is there any relation
between $s(x)$ and $f(x)$? Does the function $s_n(x)$ approximate $f(x)$ for large $n$ in
some sense?



Lemma 3.4. Let¹ $D_n(x) = \dfrac{\sin(n+\frac{1}{2})x}{\sin\frac{1}{2}x}$. Then

\[ s_n(y) = \frac{1}{2\pi}\int_0^{2\pi} f(y + x)D_n(x)\,dx. \]

Proof. Exercise. □

Now let $X$ be the Banach space of continuous functions $f : [0, 2\pi] \to \mathbb{R}$ with
$f(0) = f(2\pi)$, with the uniform norm.



¹ This function is called the Dirichlet kernel. For the lemma, it is helpful to notice that
$D_n(x) = \sum_{j=-n}^{n} e^{ijx}$. If you read up on Fourier analysis, it will be helpful to note that the
Dirichlet kernel is not a "summability kernel".
Lemma 3.5. The linear operator $T_n : X \to \mathbb{R}$ defined by

\[ T_n(f) = \frac{1}{2\pi}\int_0^{2\pi} f(x)D_n(x)\,dx \]

is bounded, and

\[ \|T_n\| = \frac{1}{2\pi}\int_0^{2\pi} |D_n(x)|\,dx. \]

Proof. For any $f \in X$,

\[ |T_n(f)| \le \frac{1}{2\pi}\int_0^{2\pi} |f(x)||D_n(x)|\,dx \le \frac{1}{2\pi}\|f\|\int_0^{2\pi} |D_n(x)|\,dx, \]

so

\[ \|T_n\| \le \frac{1}{2\pi}\int_0^{2\pi} |D_n(x)|\,dx. \]

Assume that for some $\delta > 0$ we have

\[ \|T_n\| = \frac{1}{2\pi}\int_0^{2\pi} |D_n(x)|\,dx - \delta. \tag{33} \]

Then since for fixed $n$, $|D_n(x)| \le M_n$ is bounded, we may find a continuous function
$f_n \in X$ with $\|f_n\| \le 1$ that differs from $\mathrm{sign}(D_n(x))$ only on a finite union of intervals whose total length
does not exceed $\frac{1}{M_n}\delta$. Then (don't think about this - just draw a picture)

\[ \left|\frac{1}{2\pi}\int_0^{2\pi} f_n(x)D_n(x)\,dx\right| > \frac{1}{2\pi}\int_0^{2\pi} |D_n(x)|\,dx - \delta, \]

which contradicts the assumption (33). We conclude that

\[ \|T_n\| = \frac{1}{2\pi}\int_0^{2\pi} |D_n(x)|\,dx. \]

□



We are now ready to see a genuinely non-trivial and important observation 
about the basic theorems of Fourier analysis. 

Theorem 3.4. There exists a continuous function $f : [0, 2\pi] \to \mathbb{R}$, with $f(0) = f(2\pi)$,
such that its Fourier series diverges at $x = 0$.

Proof. By Lemma 3.4, we have

\[ T_n(f) = s_n(0) \]

for all $f \in X$. Moreover, for fixed $f \in X$, if the Fourier series of $f$ converges at 0,
then the family $\{T_n f\}$ is bounded as $n$ varies (since each element is just a partial
sum of a convergent series). Thus if the Fourier series of $f$ converges at 0 for all
$f \in X$, then for each $f \in X$ the set $\{T_n f\}$ is bounded. By Theorem 3.2, this
implies that the set $\{\|T_n\|\}$ is bounded, which contradicts Lemma 3.5 combined
with Lemma 3.3.

The conclusion is that there must be some $f \in X$ whose Fourier series does not
converge at 0. □

Exercise 3.3. The problem of deciding whether or not the Fourier series of a
given function converges at a specific point (or everywhere) is difficult and usually
requires some degree of smoothness (differentiability). You can read about various
results in many books - a good starting point is Fourier Analysis, Tom Körner,
Cambridge University Press (1988).

It is more natural in functional analysis to ask for an appropriate semi-norm
in which $\|s(x) - f(x)\| = 0$ for some class of functions $f$.

6. Open mapping theorem 

Recall that a continuous map between normed spaces has the property that the
pre-image of any open set is open, but in general the image of an open set is not
open (Exercise 1.1). Bounded linear maps from one Banach space onto another cannot
behave in this way.

Theorem 3.5. Let $X$ and $Y$ be Banach spaces, and let $T$ be a bounded linear
map from $X$ onto $Y$. Then $T$ maps open sets in $X$ onto open sets in $Y$.

Of course the assumption that $T$ maps $X$ onto $Y$ is crucial: think of the projection
$(x, y) \mapsto (x, 0)$ from $\mathbb{R}^2 \to \mathbb{R}^2$. This is bounded and linear, but not onto, and
certainly cannot send open sets to open sets.

The proof of the Open-Mapping theorem is long and requires the Baire category
theorem, so it will be omitted from the lectures. For completeness it is given here
in the next three lemmas.

Some notation: use $B_r^X$ and $B_r^Y$ to denote the open balls of radius $r$ centre 0
in $X$ and $Y$ respectively.

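Lemma 3.6. For any $\epsilon > 0$, there is a $\delta > 0$ such that

\[ \overline{TB_\epsilon^X} \supset B_\delta^Y. \tag{34} \]

Proof. Since $X = \bigcup_{n=1}^{\infty} nB_{\epsilon/2}^X$, and $T$ is onto, we have $Y = T(X) = \bigcup_{n=1}^{\infty} nTB_{\epsilon/2}^X$.
By the Baire category theorem (Theorem A.4) it follows that, for some $n$, the set
$\overline{nTB_{\epsilon/2}^X}$ contains some ball $B_r^Y(z)$ in $Y$. Then $\overline{TB_{\epsilon/2}^X}$ must contain the ball $B_\delta^Y(y_0)$,
where $y_0 = \frac{1}{n}z$ and $\delta = \frac{1}{n}r$. It follows that the set

\[ P = \{y_1 - y_2 \mid y_1 \in B_\delta^Y(y_0),\ y_2 \in B_\delta^Y(y_0)\} \]

is contained in the closure of the set $TQ$, where

\[ Q = \{x_1 - x_2 \mid x_1 \in B_{\epsilon/2}^X,\ x_2 \in B_{\epsilon/2}^X\} \subset B_\epsilon^X. \]

Thus, $\overline{TB_\epsilon^X} \supset P$. Any point $y \in B_\delta^Y$ can be written in the form $y = (y + y_0) - y_0$,
so $B_\delta^Y \subset P$, and (34) follows. □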

Lemma 3.7. For any $\epsilon_0 > 0$ there is a $\delta_0 > 0$ such that

\[ TB_{2\epsilon_0}^X \supset B_{\delta_0}^Y. \tag{35} \]

Proof. Choose a sequence $(\epsilon_n)$ with each $\epsilon_n > 0$ and $\sum_{n=1}^{\infty} \epsilon_n < \epsilon_0$. By
Lemma 3.6 there is a sequence $(\delta_n)$ of positive numbers such that

\[ \overline{TB_{\epsilon_n}^X} \supset B_{\delta_n}^Y \tag{36} \]

for all $n \ge 0$. Without loss of generality, assume that $\delta_n \to 0$ as $n \to \infty$.

Let $y$ be any point in $B_{\delta_0}^Y$. By (36) with $n = 0$ there is a point $x_0 \in B_{\epsilon_0}^X$ with
$\|y - Tx_0\| < \delta_1$. Since $(y - Tx_0) \in B_{\delta_1}^Y$, (36) with $n = 1$ implies that there exists a
point $x_1 \in B_{\epsilon_1}^X$ such that $\|y - Tx_0 - Tx_1\| < \delta_2$. Continuing, we obtain a sequence
$(x_n)$ such that $x_n \in B_{\epsilon_n}^X$ for all $n$, and

\[ \left\| y - T\left(\sum_{k=0}^{n} x_k\right) \right\| < \delta_{n+1}. \tag{37} \]

Since $\|x_n\| < \epsilon_n$, the series $\sum_n x_n$ is absolutely convergent, so by Lemma 2.1 it is
convergent; write $x = \sum_n x_n$. Then

\[ \|x\| \le \sum_{n=0}^{\infty} \|x_n\| \le \sum_{n=0}^{\infty} \epsilon_n < 2\epsilon_0. \]

The map $T$ is continuous, so (37) shows that $y = Tx$ since $\delta_n \to 0$.

That is, for any $y \in B_{\delta_0}^Y$ we have found a point $x \in B_{2\epsilon_0}^X$ such that $Tx = y$,
proving (35). □

Lemma 3.8. For any open set $G \subset X$ and for any point $y = Tx$, $x \in G$, there
is an open ball $B_\eta^Y$ such that $y + B_\eta^Y \subset T(G)$.

Notice that Lemma 3.8 proves Theorem 3.5 since it implies that $T(G)$ is open.

Proof. Since $G$ is open, there is a ball $B_\epsilon^X$ such that $x + B_\epsilon^X \subset G$. By Lemma
3.7, $T(B_\epsilon^X) \supset B_\eta^Y$ for some $\eta > 0$. Hence

\[ T(G) \supset T(x + B_\epsilon^X) = T(x) + T(B_\epsilon^X) \supset y + B_\eta^Y. \]

□

As an application of Theorem 3.5, we establish a general property of inverse 
maps. Generalizing Definition 3.1 slightly, we have the following. 

Definition 3.6. Let $T : X \to Y$ be an injective linear operator. Define the
inverse of $T$, $T^{-1}$, by requiring that

\[ T^{-1}y = x \quad \text{if and only if} \quad Tx = y. \]

Then the domain of $T^{-1}$ is a linear subspace of $Y$, and $T^{-1}$ is a linear operator.

It is easy to check that $T^{-1}Tx = x$ for all $x \in X$, and $TT^{-1}y = y$ for all $y$ in
the domain of $T^{-1}$.

Lemma 3.9. Let $X$ and $Y$ be Banach spaces, and let $T$ be an injective bounded
linear map from $X$ onto $Y$. Then $T^{-1}$ is a bounded linear map.

Proof. Since $T^{-1}$ is a linear operator, we only need to show it is continuous
by Theorem 3.1. By Theorem 3.5, $T = (T^{-1})^{-1}$ maps open sets onto open sets. By
Exercise 1.1 [1], this means that $T^{-1}$ is continuous. □

Corollary 3.1. If $X$ is a Banach space with respect to two norms $\|\cdot\|^{(1)}$ and
$\|\cdot\|^{(2)}$ and there is a constant $K$ such that

\[ \|x\|^{(1)} \le K\|x\|^{(2)} \]

for all $x \in X$, then the two norms are equivalent: there is another constant $K'$ with

\[ \|x\|^{(2)} \le K'\|x\|^{(1)} \]

for all $x \in X$.

Proof. Consider the map $T : x \mapsto x$ from $(X, \|\cdot\|^{(2)})$ to $(X, \|\cdot\|^{(1)})$. By
assumption, $T$ is bounded, so by Lemma 3.9, $T^{-1}$ is also bounded, giving the
bound in the other direction. □
Definition 3.7. Let $T : X \to Y$ be a linear operator from a normed linear
space $X$ into a normed linear space $Y$, with domain $D_T$. The graph of $T$ is the set

\[ G_T = \{(x, Tx) \mid x \in D_T\} \subset X \times Y. \]

If $G_T$ is a closed set in $X \times Y$ (see Example 1.7) then $T$ is a closed operator.

Notice as usual that this notion becomes trivial in finite dimensions: if $X$ and
$Y$ are finite-dimensional, then the graph of $T$ is simply some linear subspace, which
is automatically closed. The next theorem is called the closed-graph theorem.

Theorem 3.6. Let $X$ and $Y$ be Banach spaces, and $T : X \to Y$ a linear
operator (notice that the notation means $D_T = X$). If $T$ is closed, then it is
continuous.

Proof. Fix the norm $\|(x, y)\| = \|x\|_X + \|y\|_Y$ on $X \times Y$. The graph $G_T$ is,
by linearity of $T$, a closed linear subspace in $X \times Y$, so $G_T$ is itself a Banach space.
Consider the projection $P : G_T \to X$ defined by $P(x, Tx) = x$. Then $P$ is clearly
bounded, linear, and bijective. It follows by Lemma 3.9 that $P^{-1}$ is a bounded
linear operator from $X$ into $G_T$, so

\[ \|(x, Tx)\| = \|P^{-1}x\| \le K\|x\|_X \quad \text{for all } x \in X, \]

for some constant $K$. It follows that $\|x\|_X + \|Tx\|_Y \le K\|x\|_X$ for all $x \in X$, so $T$
is bounded - and therefore $T$ is continuous by Theorem 3.1. □

7. Hahn-Banach theorem

Let $X$ be a normed linear space. A bounded linear operator from $X$ into the
normed space $\mathbb{R}$ is a (real) continuous linear functional on $X$. The space of all
continuous linear functionals is denoted $B(X, \mathbb{R}) = X^*$, and it is called the dual or
conjugate space of $X$. All the material here may be done again with $\mathbb{C}$ instead of
$\mathbb{R}$ without significant changes.

Notice that Lemma 3.2 shows that $X^*$ is itself a Banach space, whether or not
$X$ is complete.

One of the most important questions one may ask of $X^*$ is the following: are
there "enough" elements in $X^*$? (Enough, that is, to do what we need: for example, to
separate points.) This is answered in great generality using the Hahn-Banach theorem
(Theorem 3.7 below); see Corollary 3.4. First we prove the Hahn-Banach lemma.

Lemma 3.10. Let $X$ be a real linear space, and $p : X \to \mathbb{R}$ a continuous function
with

\[ p(x + y) \le p(x) + p(y), \qquad p(\lambda x) = \lambda p(x) \quad \text{for all } \lambda \ge 0,\ x, y \in X. \]

Let $Y$ be a subspace of $X$, and $f \in Y^*$ with

\[ f(x) \le p(x) \quad \text{for all } x \in Y. \]

Then there exists a functional $F \in X^*$ such that

\[ F(x) = f(x) \text{ for } x \in Y; \qquad F(x) \le p(x) \text{ for all } x \in X. \]

Proof. Let $\mathcal{K}$ be the set of all pairs $(Y_\alpha, g_\alpha)$ in which $Y_\alpha$ is a linear subspace
of $X$ containing $Y$, and $g_\alpha$ is a real linear functional on $Y_\alpha$ with the properties that

\[ g_\alpha(x) = f(x) \text{ for all } x \in Y, \qquad g_\alpha(x) \le p(x) \text{ for all } x \in Y_\alpha. \]

Make $\mathcal{K}$ into a partially ordered set by defining the relation $(Y_\alpha, g_\alpha) \le (Y_\beta, g_\beta)$ if
$Y_\alpha \subset Y_\beta$ and $g_\alpha = g_\beta$ on $Y_\alpha$. It is clear that any totally ordered subset $\{(Y_\lambda, g_\lambda)\}$
has an upper bound given by the subspace $\bigcup_\lambda Y_\lambda$ and the functional defined to be
$g_\lambda$ on each $Y_\lambda$.

By Theorem A.1, there is a maximal element $(Y_0, g_0)$ in $\mathcal{K}$. All that remains is
to check that $Y_0$ is all of $X$ (so we may take $F$ to be $g_0$).

Assume that $y_1 \in X \setminus Y_0$. Let $Y_1$ be the linear space spanned by $Y_0$ and $y_1$:
each element $x \in Y_1$ may be expressed uniquely in the form

\[ x = y + \lambda y_1, \qquad y \in Y_0,\ \lambda \in \mathbb{R}, \]

because $y_1$ is assumed not to be in the linear space $Y_0$. Define a linear functional
$g_1 \in Y_1^*$ by $g_1(y + \lambda y_1) = g_0(y) + \lambda c$.

Now we choose the constant $c$ carefully. Note that if $x, y$ are in $Y_0$, then

\[ g_0(y) - g_0(x) = g_0(y - x) \le p(y - x) \le p(y + y_1) + p(-y_1 - x), \]

so

\[ -p(-y_1 - x) - g_0(x) \le p(y + y_1) - g_0(y). \]

It follows that

\[ A = \sup_{x \in Y_0} \{-p(-y_1 - x) - g_0(x)\} \le \inf_{y \in Y_0} \{p(y + y_1) - g_0(y)\} = B. \]

Choose $c$ to be any number in the interval $[A, B]$. Then by construction of $A$ and
$B$,

\[ c \le p(y + y_1) - g_0(y) \quad \text{for all } y \in Y_0, \tag{38} \]

\[ -p(-y_1 - y) - g_0(y) \le c \quad \text{for all } y \in Y_0. \tag{39} \]

Multiply (38) by $\lambda > 0$ and substitute $\frac{y}{\lambda}$ for $y$ to obtain

\[ \lambda c \le p(y + \lambda y_1) - g_0(y). \tag{40} \]

Now multiply (39) by $\lambda < 0$, substitute $\frac{y}{\lambda}$ for $y$ and use the homogeneity assumption
on $p$ to obtain (40) again. Since (40) is clear for $\lambda = 0$, we deduce that

\[ g_1(y + \lambda y_1) = g_0(y) + \lambda c \le p(y + \lambda y_1) \]

for all $\lambda \in \mathbb{R}$ and $y \in Y_0$. That is, $(Y_1, g_1) \in \mathcal{K}$ and $(Y_0, g_0) \le (Y_1, g_1)$ with $Y_0 \ne Y_1$.
This contradicts the maximality of $(Y_0, g_0)$. □

For real linear spaces, the Hahn-Banach theorem follows at once (for complex 
spaces a little more work is needed). 

Theorem 3.7. Let $X$ be a real normed space, and $Y$ a linear subspace. Then
for any $y^* \in Y^*$ there corresponds an $x^* \in X^*$ such that

\[ \|x^*\| = \|y^*\|, \quad \text{and} \quad x^*(y) = y^*(y) \text{ for all } y \in Y. \]

That is, any linear functional defined on a subspace may be extended to a linear
functional on the whole space with the same norm.

Proof. Let $p(x) = \|y^*\|\|x\|$, $f(x) = y^*(x)$, and $x^* = F$. Apply the Hahn-Banach
Lemma 3.10. To check that $\|x^*\| \le \|y^*\|$, write $x^*(x) = \theta|x^*(x)|$ for $\theta = \pm 1$.
Then

\[ |x^*(x)| = \theta x^*(x) = x^*(\theta x) \le p(\theta x) = \|y^*\|\|\theta x\| = \|y^*\|\|x\|. \]

The reverse inequality is clear, so $\|x^*\| = \|y^*\|$. □
Many useful results follow from the Hahn-Banach theorem.

Corollary 3.2. Let $Y$ be a linear subspace of a normed linear space $X$, and
let $x_0 \in X$ have the property that

\[ \inf_{y \in Y} \|y - x_0\| = d > 0. \tag{41} \]

Then there exists a point $x^* \in X^*$ such that

\[ x^*(x_0) = 1, \qquad \|x^*\| = \frac{1}{d}, \qquad x^*(y) = 0 \text{ for all } y \in Y. \]

Proof. Let $Y_1$ be the linear space spanned by $Y$ and $x_0$. Since $x_0 \notin Y$, every
point $x$ in $Y_1$ may be represented uniquely in the form $x = y + \lambda x_0$, with $y \in Y$,
$\lambda \in \mathbb{R}$. Define a linear functional $z^* \in Y_1^*$ by $z^*(y + \lambda x_0) = \lambda$. If $\lambda \ne 0$, then

\[ \|y + \lambda x_0\| = |\lambda| \left\| \frac{y}{\lambda} + x_0 \right\| \ge |\lambda| d. \]

It follows that $|z^*(x)| \le \|x\|/d$ for all $x \in Y_1$, so $\|z^*\| \le \frac{1}{d}$. Choose a sequence
$(y_n) \subset Y$ with $\|x_0 - y_n\| \to d$ as $n \to \infty$. Then

\[ 1 = z^*(x_0 - y_n) \le \|z^*\|\|x_0 - y_n\| \to \|z^*\| d, \]

so $\|z^*\| = \frac{1}{d}$. Apply Theorem 3.7 to $z^*$. □

Corollary 3.3. Let $X$ be a normed linear space. Then, for any $x \ne 0$ in $X$
there is a functional $x^* \in X^*$ with $\|x^*\| = 1$ and $x^*(x) = \|x\|$.

Proof. Apply Corollary 3.2 with $Y = \{0\}$ to find $z^* \in X^*$ such that $\|z^*\| =
1/\|x\|$, $z^*(x) = 1$. We may therefore take $x^*$ to be $\|x\|z^*$. □
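In finite dimensions the functional promised by Corollary 3.3 can be written down directly, with no appeal to Zorn's lemma. The sketch below is only an illustration under stated assumptions: we take $X = \mathbb{R}^n$ with the 1-norm, for which the dual norm of the functional $v \mapsto \sum_i c_i v_i$ is $\max_i |c_i|$; the helper names are hypothetical.

```python
def norm1(x):
    """The 1-norm on R^n."""
    return sum(abs(t) for t in x)

def norming_functional(x):
    """Coefficients c of a functional x*(v) = sum c_i v_i with x*(x) = ||x||_1.

    We take c_i = sign(x_i), choosing +1 where x_i = 0 (any value in [-1, 1] would do there).
    """
    return [1.0 if t >= 0 else -1.0 for t in x]

def apply_functional(c, v):
    return sum(ci * vi for ci, vi in zip(c, v))

def dual_norm(c):
    """Norm of v -> sum c_i v_i as a functional on (R^n, 1-norm): the max-norm of c."""
    return max(abs(t) for t in c)

x = [3.0, -4.0, 0.0, 1.5]
c = norming_functional(x)
print(apply_functional(c, x), norm1(x), dual_norm(c))
```

The choice of coefficient where $x_i = 0$ is arbitrary, which shows that the norming functional need not be unique.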

Corollary 3.4. If $z \ne y$ in a normed linear space $X$, then there exists $x^* \in
X^*$ such that $x^*(y) \ne x^*(z)$.

Proof. Apply Corollary 3.3 with $x = y - z$. □

Corollary 3.5. If $X$ is a normed linear space, then

\[ \|x\| = \sup_{x^* \ne 0} \frac{|x^*(x)|}{\|x^*\|} = \sup_{\|x^*\| = 1} |x^*(x)|. \]

Proof. The last two expressions are clearly equal. It is also clear that

\[ \sup_{\|x^*\| = 1} |x^*(x)| \le \|x\|. \]

By Corollary 3.3, there exists $x_0^*$ such that $x_0^*(x) = \|x\|$ and $\|x_0^*\| = 1$, so

\[ \sup_{\|x^*\| = 1} |x^*(x)| \ge \|x\|. \]

□

Corollary 3.6. Let $Y$ be a linear subspace of the normed linear space $X$. If
$Y$ is not dense in $X$, then there exists a functional $x^* \ne 0$ such that $x^*(y) = 0$ for
all $y \in Y$.

Proof. Notice that if there is no point $x_0 \in X$ satisfying (41) then $Y$ must be
dense in $X$. So we may choose $x_0$ with (41) and apply Corollary 3.2. □
Notice finally that linear functionals allow us to decompose a linear space: let
$X$ be a normed linear space, and $x^* \in X^*$. The null space or kernel of $x^*$ is the
linear subspace $N_{x^*} = \{x \in X \mid x^*(x) = 0\}$. If $x^* \ne 0$, then there is a point
$x_0 \ne 0$ such that $x^*(x_0) = 1$. Any element $x \in X$ can then be written $x = z + \lambda x_0$,
with $\lambda = x^*(x)$ and $z = x - \lambda x_0 \in N_{x^*}$. Thus, $X = N_{x^*} \oplus Y$, where $Y$ is the
one-dimensional space spanned by $x_0$.
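The decomposition just described is easy to carry out by hand in $\mathbb{R}^n$. The following sketch assumes that $x^*$ is given by a coefficient vector `c` (so $x^*(v) = \sum_i c_i v_i$) and makes one particular, hypothetical choice of $x_0$, namely $c/\|c\|_2^2$; any other point with $x^*(x_0) = 1$ would serve equally well.

```python
def functional(c):
    """The linear functional v -> sum c_i v_i on R^n."""
    return lambda v: sum(ci * vi for ci, vi in zip(c, v))

def decompose(c, x):
    """Split x = z + lam * x0 with z in the kernel of x* and x*(x0) = 1."""
    x_star = functional(c)
    scale = sum(ci * ci for ci in c)          # assumes c is not the zero vector
    x0 = [ci / scale for ci in c]             # one convenient point with x*(x0) = 1
    lam = x_star(x)
    z = [xi - lam * x0i for xi, x0i in zip(x, x0)]
    return z, lam, x0

c = [1.0, 2.0, -2.0]
x = [3.0, 0.0, 1.0]
z, lam, x0 = decompose(c, x)
print(z, lam, x0)
```

Here `z` lies in the null space of the functional (up to rounding) and `x` is recovered as `z + lam * x0`.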
Integration

We have seen in Examples 1.10 [3] that the space $C[0, 1]$ of continuous functions
with the $p$-norm

\[ \|f\|_p = \left( \int_0^1 |f(t)|^p\,dt \right)^{1/p} \]

is not complete, even if we extend the space to Riemann integrable functions.

As discussed in the section on completions, we can think of the completion of
the space in terms of all limit points of (equivalence classes of) Cauchy sequences.
This does not give any real sense of what kind of functions are in the completion. In
this chapter we construct the completions $L^p$ for $1 \le p < \infty$ by describing (without
proofs) the Lebesgue¹ integral.

1. Lebesgue measure 

Definition 4.1. Let $\mathcal{B}$ denote the smallest collection of subsets of $\mathbb{R}$ that includes
all the open sets and is closed under countable unions, countable intersections
and complements. These sets are called the Borel sets.

In fact the Borel sets form a $\sigma$-algebra: $\mathbb{R}, \emptyset \in \mathcal{B}$, and $\mathcal{B}$ is closed under
countable unions and intersections. We will call Borel sets measurable. Many
subsets of $\mathbb{R}$ are not measurable, but all the ones you can write down or that might
arise in a practical setting are measurable.

The Lebesgue measure on $\mathbb{R}$ is a map $\mu : \mathcal{B} \to \mathbb{R} \cup \{\infty\}$ with the properties
that

(i) $\mu[a, b] = \mu(a, b) = b - a$;

(ii) $\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n)$ for pairwise disjoint sets $A_n \in \mathcal{B}$.

Notice that the Lebesgue measure attaches a measure to all measurable sets.
Sets of measure zero are called null sets, and something that happens everywhere
except on a set of measure zero is said to happen almost everywhere, often written
simply a.e. For technical reasons, allow any subset of a null set to also be regarded
as "measurable", with measure zero.

Exercise 4.1. [1] Prove that $\mu(\mathbb{Q}) = 0$. Thus a.e. real number is irrational.

¹ Henri Léon Lebesgue (1875-1941) was a French mathematician who revolutionized the field
of integration by his generalization of the Riemann integral. Up to the end of the 19th century,
mathematical analysis was limited to continuous functions, based largely on the Riemann method
of integration. Building on the work of others, including that of the French mathematicians Émile
Borel and Camille Jordan, Lebesgue developed (in 1901) his theory of measure. A year later,
Lebesgue extended the usefulness of the definite integral by defining the Lebesgue integral: a
method of extending the concept of area below a curve to include many discontinuous functions.
Lebesgue served on the faculty of several French universities. He made major contributions in
other areas of mathematics, including topology, potential theory, and Fourier analysis.
[2] More can be said: call a real number algebraic if it is a zero of some polynomial
with rational coefficients, and transcendental if not. Then a.e. real number is
transcendental.

[3] Prove that for any measurable sets $A$, $B$, $\mu(A \cup B) = \mu(A) + \mu(B) - \mu(A \cap B)$.

[4] Can you construct² a set that is not a member of $\mathcal{B}$?

Definition 4.2. A function $f : \mathbb{R} \to \mathbb{R} \cup \{\pm\infty\}$ is a Lebesgue measurable
function if $f^{-1}(A) \in \mathcal{B}$ for every $A \in \mathcal{B}$.

Example 4.1. [1] The characteristic function $\chi_{\mathbb{Q}}$, defined by $\chi_{\mathbb{Q}}(x) = 1$ if
$x \in \mathbb{Q}$, and $= 0$ if $x \notin \mathbb{Q}$, is an example of a measurable function that is not
Riemann integrable.

[2] All continuous functions are measurable (by Exercise 1.1 [1]).

The basic idea in Riemann integration is to approximate functions by step
functions, whose "integrals" are easy to find. These give the upper and lower
estimates. In the Lebesgue theory, we do something similar, using simple functions
instead of step functions.

A simple function is a map $f : \mathbb{R} \to \mathbb{R}$ of the form

\[ f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x) \tag{42} \]

where the $c_i$ are non-zero constants and the $E_i$ are disjoint measurable sets with
$\mu(E_i) < \infty$.

The integral of the simple function (42) is defined to be

\[ \int_E f\,d\mu = \sum_{i=1}^{n} c_i \mu(E \cap E_i) \]

for any measurable set $E$.

The basic approximation fact in the Lebesgue integral is the following: if $f :
\mathbb{R} \to \mathbb{R} \cup \{\pm\infty\}$ is measurable and non-negative, then there is an increasing sequence
$(f_n)$ of simple functions with the property that $f_n(t) \to f(t)$ a.e. We write this as
$f_n \uparrow f$ a.e., and define the integral of $f$ to be

\[ \int_E f\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu. \]

Notice that (once we allow the value $\infty$), the limit is guaranteed to exist since the
sequence is increasing.
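One standard way to produce such an increasing sequence is the dyadic truncation $f_n = \min\left(n, 2^{-n}\lfloor 2^n f \rfloor\right)$, which takes only finitely many values. The sketch below evaluates these approximants pointwise for a sample function; the function `f`, the sample points and the helper name are our own illustrative choices, and the code only evaluates $f_n$ at points rather than representing the sets $E_i$.

```python
import math

def simple_approximant(f, n):
    """Return f_n = min(n, floor(2^n f) / 2^n) for a non-negative function f."""
    scale = 2 ** n
    def f_n(t):
        return min(float(n), math.floor(scale * f(t)) / scale)
    return f_n

def f(t):
    # Hypothetical example: a non-negative continuous function on [0, 2].
    return t * t

sample_points = [k / 8.0 for k in range(17)]            # points of [0, 2]
approximants = [simple_approximant(f, n) for n in range(1, 11)]

for t in (0.375, 1.0, 1.625):
    print(t, [fn(t) for fn in approximants[:5]], f(t))
```

At each sample point the printed values increase with $n$ and never exceed $f(t)$; once $n \ge f(t)$ the error is at most $2^{-n}$.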

2 This "construction" requires the use of the Axiom of Choice and is closely related to the 
existence of a Hamel basis for R as a vector space over Q. The question really has two faces: 
1) using the usual axioms of set theory (including the Axiom of Choice), can you exhibit a non- 
measurable subset of R? 2) using the usual axioms of set theory without the Axiom of Choice, is 
it still possible to exhibit a non-measurable subset of R? 

The first question is easily answered. The second question is much deeper because the answer 
is "no" . This is part of a subject called Model Theory. Solovay showed that there is a model of 
set theory (excluding the Axiom of Choice but including a further axiom) in which every subset 
of R is measurable. Shelah tried to remove Solovay's additional axiom, and answered a related 
question by exhibiting a model of set theory (excluding the Axiom of Choice but otherwise as 
usual) in which every subset of R has the Baire property. The references are R.M. Solovay, "A 
model of set-theory in which every set of reals is Lebesgue measurable" , Annals of Math. 92 
(1970), 1-56, and S. Shelah, "Can you take Solovay's inaccessible away?", Israel Journal of Math. 
48 (1984), 1-47 but both of them require extensive additional background to read. 
For a general measurable function $f$, write $f = f^+ - f^-$ where both $f^+$ and
$f^-$ are non-negative and measurable, then define

\[ \int_E f\,d\mu = \int_E f^+\,d\mu - \int_E f^-\,d\mu. \]

Example 4.2. Let $f(x) = \chi_{\mathbb{Q} \cap [0,1]}(x)$. Then $f$ is itself a simple function, so

\[ \int_0^1 f\,d\mu = \mu(\mathbb{Q} \cap [0, 1]) = 0. \]

A measurable function $f$ on $[a, b]$ is essentially bounded if there is a constant
$K$ such that $|f(x)| \le K$ a.e. on $[a, b]$. The essential supremum of such a function
is the infimum of all such essential bounds $K$, written

\[ \|f\|_\infty = \operatorname{ess.sup}_{[a,b]} |f|. \]

Definition 4.3. Define $\mathcal{L}^p[a, b]$ to be the linear space of measurable functions
$f$ on $[a, b]$ for which

\[ \|f\|_p = \left( \int_a^b |f|^p\,d\mu \right)^{1/p} < \infty \]

for $p \in [1, \infty)$ and $\mathcal{L}^\infty[a, b]$ to be the linear space of essentially bounded functions.
Notice that $\|\cdot\|_p$ on $\mathcal{L}^p$ is only a semi-norm, since many non-zero functions will for example
have $\|f\|_p = 0$. Define an equivalence relation on $\mathcal{L}^p$ by $f \sim g$ if $\{x \in \mathbb{R} \mid f(x) \ne
g(x)\}$ is a null set. Then define

\[ L^p[a, b] = \mathcal{L}^p / \sim, \]

the space of $L^p$ functions.

In practice we will not think of elements of $L^p$ as equivalence classes of functions,
but as functions defined a.e. A similar definition may be made of $p$-integrable
functions on $\mathbb{R}$, giving the linear space $L^p(\mathbb{R})$.

The following theorems are proved in any book on measure theory or modern
analysis or may be found in any of the references. Theorem 4.1 is sometimes called
the Riesz-Fischer theorem; Theorem 4.2 is Hölder's inequality.

Theorem 4.1. The normed spaces $L^p[a, b]$ and $L^p(\mathbb{R})$ are (separable) Banach
spaces under the norm $\|\cdot\|_p$.

Theorem 4.2. If $\frac{1}{r} = \frac{1}{p} + \frac{1}{q}$ then

\[ \|fg\|_r \le \|f\|_p \|g\|_q \]

for any $f \in L^p[a, b]$, $g \in L^q[a, b]$. It follows that for any measurable $f$ on $[a, b]$
with $b - a \le 1$,

\[ \|f\|_1 \le \|f\|_2 \le \|f\|_3 \le \cdots \le \|f\|_\infty \]

(for a general interval the same inequalities hold up to constants depending only
on $b - a$). Hence

\[ L^1[a, b] \supset L^2[a, b] \supset \cdots \supset L^\infty[a, b]. \]

In the theorem we allow $p$ and $q$ to be anything in $[1, \infty]$ with the obvious
interpretation of $\frac{1}{\infty}$.

Note the "opposite" behaviour to the sequence spaces $\ell_p$ in Example 1.4 [3],
where we saw that

\[ \ell_1 \subset \ell_2 \subset \cdots \subset \ell_\infty. \]
Two easy consequences of Hölder's inequality are the Cauchy-Schwarz inequality

\[ \|fg\|_1 \le \|f\|_2 \|g\|_2 \]

and Minkowski's inequality,

\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
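These inequalities can be checked by hand on step functions, for which the integrals are finite sums. The sketch below assumes functions on $[0, 1]$ that are constant on equal subintervals and are represented by their lists of values; the particular lists `f` and `g` are arbitrary illustrative data. It is a sanity check on examples, not a proof.

```python
def p_norm(values, p):
    """L^p norm on [0, 1] of the step function equal to values[k] on the k-th of
    len(values) equal subintervals; p = float('inf') gives the sup norm."""
    n = len(values)
    if p == float("inf"):
        return max(abs(v) for v in values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

f = [0.5, -2.0, 1.5, 3.0, 0.0, -1.0, 2.5, 0.25]
g = [1.0, 0.5, -3.0, 2.0, -0.75, 4.0, 0.0, -1.5]
fg = [a * b for a, b in zip(f, g)]
f_plus_g = [a + b for a, b in zip(f, g)]

# Each "gap" is right-hand side minus left-hand side, so it should be non-negative.
cauchy_schwarz_gap = p_norm(f, 2) * p_norm(g, 2) - p_norm(fg, 1)
holder_gap = p_norm(f, 3) * p_norm(g, 1.5) - p_norm(fg, 1)      # 1/1 = 1/3 + 1/1.5
minkowski_gaps = {p: p_norm(f, p) + p_norm(g, p) - p_norm(f_plus_g, p) for p in (1, 2, 3)}
norm_chain = [p_norm(f, p) for p in (1, 2, 3, float("inf"))]

print(cauchy_schwarz_gap, holder_gap, minkowski_gaps, norm_chain)
```

Because the interval has length 1, the list `norm_chain` is non-decreasing, matching the chain of norms in Theorem 4.2.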

The most useful general result about Lebesgue integration is Lebesgue's dominated
convergence theorem.

Theorem 4.3. Let $(f_n)$ be a sequence of measurable functions on a measurable
set $E$ such that $f_n(t) \to f(t)$ a.e. and there exists an integrable function $g$ such
that $|f_n(t)| \le g(t)$ a.e. Then

\[ \int_E f\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu. \]

Exercise 4.2. [1] Prove that the $L^p$-norm is strictly convex for $1 < p < \infty$
but is not strictly convex if $p = 1$ or $\infty$.

2. Product spaces and Fubini's theorem 

Let $X$ and $Y$ be two subsets of $\mathbb{R}$. Let $\mathcal{A}$, $\mathcal{B}$ denote the $\sigma$-algebra of Borel sets
in $X$ and $Y$ respectively.

Subsets of $X \times Y$ (Cartesian product) of the form

\[ A \times B = \{(x, y) : x \in A,\ y \in B\} \]

with $A \in \mathcal{A}$, $B \in \mathcal{B}$ are called (measurable) rectangles. Let $\mathcal{A} \times \mathcal{B}$ denote the
smallest $\sigma$-algebra on $X \times Y$ containing all the measurable rectangles. Notice that,
despite the notation, this is much larger than the set of all measurable rectangles.
The measure space $(X \times Y, \mathcal{A} \times \mathcal{B})$ is the Cartesian product of $(X, \mathcal{A})$ and $(Y, \mathcal{B})$.
Let $\mu_X$ and $\mu_Y$ denote Lebesgue measure on $X$ and $Y$. Then there is a unique
measure $\lambda$ on $X \times Y$ with the property that

\[ \lambda(A \times B) = \mu_X(A) \cdot \mu_Y(B) \]

for all measurable rectangles $A \times B$. This measure is called the product measure
of $\mu_X$ and $\mu_Y$ and we write $\lambda = \mu_X \times \mu_Y$.

The most important result on product measures is Fubini's theorem.

Theorem 4.4. If $h$ is an integrable function on $X \times Y$, then $x \mapsto h(x, y)$ is an
integrable function of $x$ for a.e. $y$, $y \mapsto h(x, y)$ is an integrable function of $y$ for
a.e. $x$, and

\[ \int h\,d(\mu_X \times \mu_Y) = \int\!\!\int h\,d\mu_X\,d\mu_Y = \int\!\!\int h\,d\mu_Y\,d\mu_X. \]
Hilbert spaces

We have seen how useful the property of completeness is in our applications 
of Banach-space methods to certain differential and integral equations. However, 
some obvious ideas for use in differential equations (like Fourier analysis) seem to 
go wrong in the obvious Banach space setting (cf. Theorem 3.4). It turns out that 
not all Banach spaces are equally good - there are distinguished ones in which the 
parallelogram law (equation (43) below) holds, and this has enormous consequences. 
It makes more sense in this section to deal with complex linear spaces, so from now 
on assume that the ground field is C. 

1. Hilbert spaces 

Definition 5.1. A complex linear space $H$ is called a Hilbert¹ space if there
is a complex-valued function $(\cdot, \cdot) : H \times H \to \mathbb{C}$ with the properties

(i) $(x, x) \ge 0$, and $(x, x) = 0$ if and only if $x = 0$;

(ii) $(x + y, z) = (x, z) + (y, z)$ for all $x, y, z \in H$;

(iii) $(\lambda x, y) = \lambda(x, y)$ for all $x, y \in H$ and $\lambda \in \mathbb{C}$;

(iv) $(x, y) = \overline{(y, x)}$ for all $x, y \in H$;

(v) the norm defined by $\|x\| = (x, x)^{1/2}$ makes $H$ into a Banach space.

If only properties (i), (ii), (iii), (iv) hold then $(H, (\cdot, \cdot))$ is called an inner-product
space.

Notice that property (v) makes sense since by (i) $(x, x) \ge 0$, and we shall see
below (Lemma 5.2) that $\|\cdot\|$ is indeed a norm.

The function $(\cdot, \cdot)$ is called an inner or scalar product, and so a Hilbert space
is a complete inner product space.

If the scalar product is real-valued on a real linear space, then the properties
determine a real Hilbert space; all the results below apply to these.

Notice that (iii) and (iv) imply that $(x, \lambda y) = \bar{\lambda}(x, y)$, and $(x, 0) = (0, x) = 0$.

Example 5.1. [1] If $X = \mathbb{C}^n$, then $(x, y) = \sum_{i=1}^{n} x_i \overline{y_i}$ makes $\mathbb{C}^n$ into an
$n$-dimensional Hilbert space.

¹ David Hilbert (1862-1943) was a German mathematician whose work in geometry had the
greatest influence on the field since Euclid. After making a systematic study of the axioms of
Euclidean geometry, Hilbert proposed a set of 21 such axioms and analyzed their significance.
Hilbert received his Ph.D. from the University of Königsberg and served on its faculty from 1886
to 1895. He became (1895) professor of mathematics at the University of Göttingen, where he
remained for the rest of his life. Between 1900 and 1914, many mathematicians from the United
States and elsewhere who later played an important role in the development of mathematics
went to Göttingen to study under him. Hilbert contributed to several branches of mathematics,
including algebraic number theory, functional analysis, mathematical physics, and the calculus of
variations. He also enumerated 23 unsolved problems of mathematics that he considered worthy
of further investigation. Since Hilbert's time, nearly all of these problems have been solved.
[2] Let $X = C[a, b]$ (complex-valued continuous functions). Then the inner-product
$(f, g) = \int_a^b f(t)\overline{g(t)}\,dt$ makes $X$ into an inner-product space that is not a Hilbert
space.

[3] Let $X = \ell_2$ (square-summable sequences; see Example 1.4 [3]) with the inner-product
$((x_n), (y_n)) = \sum_{n=1}^{\infty} x_n \overline{y_n}$. This is well defined by the Schwarz inequality
(Lemma 5.1), and it is a Hilbert space by Example 2.1 [3]. We shall see later that $\ell_2$
is the only $\ell_p$ space that is a Hilbert space.

[4] Let $X = L^2[a, b]$ with inner-product $(f, g) = \int_a^b f(t)\overline{g(t)}\,dt$. Then $X$ is a Hilbert
space (by the Cauchy-Schwarz inequality and Theorem 4.1).

Lemma 5.1. In a Hilbert space,

\[ |(x, y)| \le \|x\|\|y\|. \]

Proof. Assume that $x, y$ are non-zero (the result is clear if $x$ or $y$ is zero),
and let $\lambda \in \mathbb{C}$. Then

\[ 0 \le (x + \lambda y, x + \lambda y) \]
\[ = \|x\|^2 + |\lambda|^2\|y\|^2 + \lambda(y, x) + \bar{\lambda}(x, y) \]
\[ = \|x\|^2 + |\lambda|^2\|y\|^2 + 2\Re[\bar{\lambda}(x, y)]. \]

Let $\lambda = -re^{i\theta}$ for some $r > 0$, and choose $\theta$ such that $\theta = \arg(x, y)$ if $(x, y) \ne 0$,
so that $\bar{\lambda}(x, y) = -r|(x, y)|$. Then

\[ \|x\|^2 + r^2\|y\|^2 \ge 2r|(x, y)|. \]

Take $r = \|x\|/\|y\|$ to obtain the result. □

Lemma 5.2. The function defined by $\|x\| = (x, x)^{1/2}$ is a norm on a Hilbert
space.

Proof. All the properties are clear except the triangle inequality. Since

\[ (x, y) + (y, x) = 2\Re(x, y) \le 2\|x\|\|y\|, \]

we have

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 + (x, y) + (y, x) \le \|x\|^2 + \|y\|^2 + 2\|x\|\|y\| = (\|x\| + \|y\|)^2, \]

so $\|x + y\| \le \|x\| + \|y\|$. □

Lemma 5.3. The norm on a Hilbert space is strictly convex (cf. Definition
1.7).

Proof. From the proof of Lemma 5.1, if $|(x, y)| = \|x\|\|y\|$, then $x = -\lambda y$.
From the proof of Lemma 5.2 it follows that if $\|x\| + \|y\| = \|x + y\|$ and $y \ne 0$ then
$x = -\lambda y$. Hence if $\|x\| = \|y\| = 1$ and $\|x + y\| = 2$, then $|\lambda| = 1$ and $|1 - \lambda| = 2$,
so $\lambda = -1$ and $x = y$. □

Next there is the peculiar parallelogram law. 

Theorem 5.1. If $H$ is a Hilbert space, then

\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2 \tag{43} \]

for all $x, y \in H$.

Conversely, if $H$ is a complex Banach space with norm $\|\cdot\|$ satisfying (43),
then $H$ is a Hilbert space with scalar product $(\cdot, \cdot)$ satisfying $\|x\| = (x, x)^{1/2}$.
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Proof. The forward direction is easy: simply expand the expression 

(x + y, x + y) + (x - y, x - y). 
For the reverse direction, define 

(x, y)=\([\\ x + y\\*-\\x- y\\ 2 } +i[\\x + iy\\ 2 - \\x - ty\\ 2 } ) (44) 
(in the real case, with the second expression simply omitted). Since 

(x,x) = \\x\\ 2 + i\\xf\i + i\ 2 - l -\\xf\i-i\ 2 = \\x\\ 2 , 

the inner-product norm (x^x) 1 / 2 coincides with the norm ||a;||. 

To prove that (-,-) satisfies condition (ii) in Definition 5.1, use (43) to show 

that 

\\u + v + w\\ 2 + \\u + v ~ w\\ 2 ~ 2\\u + v\\ 2 + 2\\w\\ 2 , 
\\u — v + w\\ 2 + \\u — v — w\\ 2 = 2\\u — v\\ 2 + 2\\w\\ 2 . 

It follows that 

+ v — w\\ 2 — \\u — v + w\\ 2 ) + (j\u + v — w\\ 2 — \\u — v — w\\ 2 ) 

= 2||?i + i;|| 2 -2||m-w|| 2 , 

showing that 

5ft(u + w, v) + SR(u -w,v) = 23?(u, v). 
A similar argument shows that 

3(m + w, v) + 3(u — w, v) = 23(u, v), 

so 

(u + w, v) + (u — w, v) = 2(u, v). 
Taking w = u shows that (2u, v) = 2(u, v). Taking u + w = x, u — w = y, v = z 



then gives 



(x, z) + (y,z)=2[ ) =(x + y, z). 



To prove condition (iii) in Definition 5.1, use (ii) to show that for $m \in \mathbb{N}$

$$(mx, y) = ((m-1)x + x, y) = ((m-1)x, y) + (x, y) = ((m-2)x, y) + 2(x, y) = \cdots = m(x, y).$$

The same argument in reverse shows that $n(x/n, y) = (x, y)$, so $(x/n, y) = (1/n)(x, y)$. If $r = m/n$ ($m, n \in \mathbb{N}$) then

$$r(x, y) = \frac{m}{n}(x, y) = m\left(\frac{x}{n}, y\right) = \left(\frac{m}{n}x, y\right) = (rx, y).$$

Now $(x, y)$ is a continuous function in $x$ (by (44)); we deduce that $\lambda(x, y) = (\lambda x, y)$ for all $\lambda > 0$. For $\lambda < 0$,

$$\lambda(x, y) - (\lambda x, y) = \lambda(x, y) - (|\lambda|(-x), y) = \lambda(x, y) - |\lambda|(-x, y) = \lambda(x, y) + \lambda(-x, y) = \lambda(0, y) = 0,$$

so (iii) holds for all $\lambda \in \mathbb{R}$. For $\lambda = i$, (iii) is clear, so if $\lambda = \mu + i\nu$,

$$\lambda(x, y) = \mu(x, y) + i\nu(x, y) = (\mu x, y) + i(\nu x, y) = (\mu x, y) + (i\nu x, y) = (\lambda x, y).$$

Condition (iv) is clear, and (v) follows from the assumption that $H$ is a Banach space. □
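As a numerical sanity check (not part of the proof), the following sketch tests the parallelogram law (43) and the polarization formula (44) for the standard inner product on $\mathbb{C}^4$. The helper names `inner`, `norm_sq`, `add` and `polarization` are ours, chosen for illustration, and the random test vectors are arbitrary. The last lines show that the sup norm on $\mathbb{R}^2$ violates (43), so it cannot arise from a scalar product.

```python
import random

def inner(x, y):
    # Inner product on C^n: linear in x, conjugate-linear in y.
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def add(x, y, c=1):
    # Componentwise x + c*y.
    return [a + c * b for a, b in zip(x, y)]

def polarization(x, y):
    # Right-hand side of (44): an inner product rebuilt from norms alone.
    re = norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
    im = norm_sq(add(x, y, 1j)) - norm_sq(add(x, y, -1j))
    return (re + 1j * im) / 4

random.seed(0)
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]

gap_43 = norm_sq(add(x, y)) + norm_sq(add(x, y, -1)) - 2 * norm_sq(x) - 2 * norm_sq(y)
gap_44 = abs(polarization(x, y) - inner(x, y))
print(abs(gap_43), gap_44)   # both at rounding-error level

# The sup norm on R^2 violates (43).
sup = lambda v: max(abs(a) for a in v)
u, v = [1.0, 0.0], [0.0, 1.0]
sup_gap = sup(add(u, v)) ** 2 + sup(add(u, v, -1)) ** 2 - 2 * sup(u) ** 2 - 2 * sup(v) ** 2
print(sup_gap)
```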



2. Projection theorem 



Let $H$ be a Hilbert space. A point $x \in H$ is orthogonal to a point $y \in H$, written $x \perp y$, if $(x, y) = 0$. For sets $N$, $M$ in $H$, $x$ is orthogonal to $N$, written $x \perp N$, if $(x, y) = 0$ for all $y \in N$. The sets $N$ and $M$ are orthogonal (written $N \perp M$) if $x \perp M$ for all $x \in N$. The orthogonal complement of $M$ is defined as

$$M^\perp = \{x \in H \mid x \perp M\}.$$

Notice that for any $M$, $M^\perp$ is a closed linear subspace of $H$.

Lemma 5.4. Let $M$ be a closed convex set in a Hilbert space $H$. For every point $x_0 \in H$ there is a unique point $y_0 \in M$ such that

$$\|x_0 - y_0\| = \inf_{y \in M}\|x_0 - y\|. \tag{45}$$

That is, it makes sense in a Hilbert space to talk about the point in a closed
convex set that is "closest" to a given point.
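To make the statement concrete, here is a small sketch in $\mathbb{R}^3$ with $M$ the closed unit ball, where the closest point is known explicitly (radial rescaling). The function names and the test point are illustrative assumptions. The sampling loop only gives evidence for (45): no sampled point of $M$ comes closer to $x_0$ than the candidate $y_0$.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def dist(u, v):
    return norm([a - b for a, b in zip(u, v)])

def project_to_ball(x0):
    # Closest point to x0 in the closed unit ball {y : ||y|| <= 1}.
    r = norm(x0)
    return list(x0) if r <= 1 else [a / r for a in x0]

random.seed(1)
x0 = [2.0, -1.0, 0.5]
y0 = project_to_ball(x0)
d = dist(x0, y0)

# Sample points of the ball and record the smallest distance to x0.
samples = []
while len(samples) < 1000:
    y = [random.uniform(-1, 1) for _ in x0]
    if norm(y) <= 1:
        samples.append(y)
closest_sampled = min(dist(x0, y) for y in samples)
print(d, closest_sampled)
```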

Proof. Let $d = \inf_{y \in M}\|x_0 - y\|$ and choose a sequence $(y_n)$ in $M$ such that $\|x_0 - y_n\| \to d$ as $n \to \infty$. By the parallelogram law (43),

$$4\left\|x_0 - \tfrac12(y_m + y_n)\right\|^2 + \|y_m - y_n\|^2 = 2\|x_0 - y_m\|^2 + 2\|x_0 - y_n\|^2 \to 4d^2$$

as $m, n \to \infty$. By convexity (Definition 1.6), $\tfrac12(y_m + y_n) \in M$, so

$$4\left\|x_0 - \tfrac12(y_m + y_n)\right\|^2 \geq 4d^2.$$

It follows that $\|y_m - y_n\| \to 0$ as $m, n \to \infty$. Now $H$ is complete and $M$ is a closed subset, so $\lim_{n\to\infty} y_n = y_0$ exists and lies in $M$. Now $\|x_0 - y_0\| = \lim_{n\to\infty}\|x_0 - y_n\| = d$, showing (45).

It remains to check that the point $y_0$ is the only point with property (45). Let $y_1$ be another point in $M$ with

$$\|x_0 - y_1\| = \inf_{y \in M}\|x_0 - y\|.$$

Then

$$2\left\|x_0 - \frac{y_0 + y_1}{2}\right\| \leq \|x_0 - y_0\| + \|x_0 - y_1\| \leq 2\inf_{y \in M}\|x_0 - y\| \leq 2\left\|x_0 - \frac{y_0 + y_1}{2}\right\|$$

since $(y_0 + y_1)/2$ lies in $M$. It follows that

$$2\left\|x_0 - \frac{y_0 + y_1}{2}\right\| = \|x_0 - y_0\| + \|x_0 - y_1\|.$$

Since the Hilbert norm is strictly convex (Lemma 5.3), we deduce that $x_0 - y_0 = x_0 - y_1$, so $y_1 = y_0$. □

This gives us the Orthogonal Projection Theorem.

Theorem 5.2. Let $M$ be a closed linear subspace of a Hilbert space $H$. Then any $x_0 \in H$ can be written $x_0 = y_0 + z_0$, $y_0 \in M$, $z_0 \in M^\perp$. The elements $y_0, z_0$ are determined uniquely by $x_0$.

Proof. If $x_0 \in M$ then $y_0 = x_0$ and $z_0 = 0$. If $x_0 \notin M$, then let $y_0$ be the point in $M$ with

$$\|x_0 - y_0\| = \inf_{y \in M}\|x_0 - y\|$$

(this point exists by Lemma 5.4). Now for any $y \in M$ and $\lambda \in \mathbb{C}$, $y_0 + \lambda y \in M$ so

$$\|x_0 - y_0\|^2 \leq \|x_0 - y_0 - \lambda y\|^2 = \|x_0 - y_0\|^2 - 2\Re\lambda(y, x_0 - y_0) + |\lambda|^2\|y\|^2.$$

Hence

$$-2\Re\lambda(y, x_0 - y_0) + |\lambda|^2\|y\|^2 \geq 0.$$

Assume now that $\lambda = \epsilon > 0$ and divide by $\epsilon$. As $\epsilon \to 0$ we deduce that

$$\Re(y, x_0 - y_0) \leq 0. \tag{46}$$

Assume next that $\lambda = -i\epsilon$ and divide by $\epsilon$. As $\epsilon \to 0$, we get

$$\Im(y, x_0 - y_0) \leq 0. \tag{47}$$

Exactly the same argument may be applied to $-y$ since $-y \in M$, showing that (46) and (47) hold with $y$ replaced by $-y$. Thus $(y, x_0 - y_0) = 0$ for all $y \in M$. It follows that the point $z_0 = x_0 - y_0$ lies in $M^\perp$.

Finally, we check that the decomposition is unique. Suppose that $x_0 = y_1 + z_1$ with $y_1 \in M$ and $z_1 \in M^\perp$. Then $y_0 - y_1 = z_1 - z_0 \in M \cap M^\perp = \{0\}$. □

Corollary 5.1. If $M$ is a closed linear subspace and $M \neq H$, then there exists an element $z_0 \neq 0$ such that $z_0 \perp M$.

Proof. Apply the projection theorem (Theorem 5.2) to any $x \in H \setminus M$. □

It follows that all bounded linear functionals on a Hilbert space are given by taking inner products; this is the Riesz theorem.

Theorem 5.3. For every bounded linear functional $x^*$ on a Hilbert space $H$ there exists a unique element $z \in H$ such that $x^*(x) = (x, z)$ for all $x \in H$. The norm of the functional is given by $\|x^*\| = \|z\|$.

Proof. Let $N$ be the null space of $x^*$; $N$ is a closed linear subspace of $H$. If $N = H$, then $x^* = 0$ and we may take $x^*(x) = (x, 0)$. If $N \neq H$, then by Corollary 5.1 there is a point $z_0 \in N^\perp$, $z_0 \neq 0$. By construction, $\alpha = x^*(z_0) \neq 0$. For any $x \in H$, the point $x - x^*(x)z_0/\alpha$ lies in $N$, so

$$(x - x^*(x)z_0/\alpha, z_0) = 0.$$

It follows that

$$x^*(x)\left(\frac{z_0}{\alpha}, z_0\right) = (x, z_0).$$

If we substitute $z = \frac{\bar\alpha}{(z_0, z_0)}z_0$, we get $x^*(x) = (x, z)$ for all $x \in H$.

To check uniqueness, assume that $x^*(x) = (x, z')$ for all $x \in H$. Then $(x, z - z') = 0$ for all $x \in H$, so (taking $x = z - z'$), $\|z - z'\| = 0$ and therefore $z = z'$.

Finally,

$$\|x^*\| = \sup_{\|x\|=1}|x^*(x)| = \sup_{\|x\|=1}|(x, z)| \leq \sup_{\|x\|=1}(\|x\|\,\|z\|) = \|z\|.$$

On the other hand,

$$\|z\|^2 = (z, z) = |x^*(z)| \leq \|x^*\|\,\|z\|,$$

so $\|z\| \leq \|x^*\|$. □
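In finite dimensions the construction in this proof can be followed step by step. The sketch below works in $\mathbb{C}^3$ with a functional $x^*(x) = \sum_k a_k x_k$ whose coefficients `a` are made up for illustration. It builds $z$ exactly as in the proof (choose $z_0 \in N^\perp$, set $\alpha = x^*(z_0)$, rescale) and checks that $x^*(x) = (x, z)$ and that the norm is attained at $z/\|z\|$.

```python
import math

def inner(x, y):
    # Linear in x, conjugate-linear in y.
    return sum(a * b.conjugate() for a, b in zip(x, y))

a = [1 + 2j, -3j, 0.5 + 0j]          # hypothetical coefficients of the functional

def functional(x):
    # x*(x) = sum_k a_k x_k, a bounded linear functional on C^3.
    return sum(ak * xk for ak, xk in zip(a, x))

# A nonzero vector orthogonal to the null space of x*: any multiple of conj(a).
z0 = [5 * ak.conjugate() for ak in a]
alpha = functional(z0)
scale = alpha.conjugate() / inner(z0, z0)
z = [scale * zk for zk in z0]        # the representing element of the proof

x = [2 - 1j, 1 + 1j, -4 + 0j]
print(functional(x), inner(x, z))    # the two values agree

norm_z = math.sqrt(inner(z, z).real)
unit = [zk / norm_z for zk in z]
print(abs(functional(unit)), norm_z) # |x*(z/||z||)| equals ||z||
```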

Corollary 5.2. If $H$ is a Hilbert space, then the space $H^*$ is also a Hilbert space. The map $\sigma : H \to H^*$ given by $(\sigma x)(y) = (y, x)$ is an isometric embedding of $H$ onto $H^*$.

Definition 5.2. Let $M$ and $N$ be linear subspaces of a Hilbert space $H$. If every element in the linear space $M + N$ has a unique representation in the form $x + y$, $x \in M$, $y \in N$, then we say $M + N$ is a direct sum. If $M \perp N$, then we write $M \oplus N$, and this sum is automatically a direct one. If $Y = M \oplus N$, then we also write $N = Y \ominus M$ and call $N$ the orthogonal complement of $M$ in $Y$.

Notice that the projection theorem says that if $M$ is a closed linear space in $H$, then $H = M \oplus M^\perp$.

3. Projection and self adjoint operators 

Definition 5.3. Let $M$ be a closed linear subspace of the Hilbert space $H$. By the projection theorem, every $x \in H$ can be written uniquely in the form $x = y + z$ with $y \in M$, $z \in M^\perp$. Call $y$ the projection of $x$ in $M$; the operator $P = P_M$ defined by $Px = y$ is the projection on $M$. The space $M$ is called the subspace of the projection $P$.

Definition 5.4. Let $T : H \to H$ be a bounded linear operator. The adjoint $T^*$ of $T$ is defined by the relation $(Tx, y) = (x, T^*y)$ for all $x, y \in H$. An operator with $T = T^*$ is called self-adjoint.

Notice that if $T$ is self-adjoint, then for every $x \in H$, $(Tx, x) \in \mathbb{R}$.

Exercise 5.1. Let $T$ and $S$ be bounded linear operators in Hilbert space $H$, and $\lambda \in \mathbb{C}$. Prove the following: $(T + S)^* = T^* + S^*$; $(TS)^* = S^*T^*$; $(\lambda T)^* = \bar\lambda T^*$; $I^* = I$; $T^{**} = T$; $\|T^*\| = \|T\|$. If $T^{-1}$ is also a bounded linear operator with domain $H$, then $(T^*)^{-1}$ is a bounded linear map with domain $H$ and $(T^{-1})^* = (T^*)^{-1}$.

Theorem 5.4. [1] If $P$ is a projection, then $P$ is self-adjoint, $P^2 = P$ and $\|P\| = 1$ if $P \neq 0$.

[2] If $P$ is a self-adjoint operator with $P^2 = P$, then $P$ is a projection.

Proof. [1] Let $P = P_M$, and $x_i = y_i + z_i$ for $i = 1, 2$ where $y_i \in M$ and $z_i \in M^\perp$. Then $\lambda_1x_1 + \lambda_2x_2 = (\lambda_1y_1 + \lambda_2y_2) + (\lambda_1z_1 + \lambda_2z_2)$ and

$$(\lambda_1y_1 + \lambda_2y_2) \in M, \quad (\lambda_1z_1 + \lambda_2z_2) \in M^\perp.$$

It follows that $P$ is linear. To see that $P^2 = P$, notice that $P^2x_1 = P(Px_1) = P(y_1) = y_1 = Px_1$ since $y_1 \in M$. Notice that $\|x_1\|^2 = \|y_1\|^2 + \|z_1\|^2 \geq \|y_1\|^2 = \|Px_1\|^2$ so $\|P\| \leq 1$. If $P \neq 0$ then for any $x \in M \setminus \{0\}$ we have $Px = x$ so $\|P\| \geq 1$. Self-adjointness is clear:

$$(Px_1, x_2) = (y_1, x_2) = (y_1, y_2) = (x_1, y_2) = (x_1, Px_2).$$

[2] Let $M = P(H)$; then $M$ is a linear subspace of $H$. If $y_n = P(x_n)$, with $y_n \to z$, then $Py_n = P^2x_n = Px_n = y_n$, so $z = \lim_n y_n = \lim_n Py_n = Pz \in M$ so $M$ is closed. Since $P$ is self-adjoint and $P^2 = P$,

$$(x - Px, Py) = (Px - P^2x, y) = 0$$

for all $y \in H$ so $x - Px \in M^\perp$. This means that $x = Px + (x - Px)$ is the unique decomposition of $x$ as a sum $y + z$ with $y \in M$ and $z \in M^\perp$. That is, $P$ is the projection $P_M$. □

We collect all the elementary properties of projections into the next theorem. Projections $P_1$ and $P_2$ are orthogonal if $P_1P_2 = 0$. Since projections are self-adjoint, $P_1P_2 = 0$ if and only if $P_2P_1 = 0$.

The projection $P_L$ is part of the projection $P_M$ if and only if $L \subset M$.

Theorem 5.5. [1] Projections $P_M$ and $P_N$ are orthogonal if and only if $M \perp N$.

[2] The sum of two projections $P_M$ and $P_N$ is a projection if and only if $P_MP_N = 0$. In that case, $P_M + P_N = P_{M \oplus N}$.

[3] The product of two projections $P_M$ and $P_N$ is another projection if and only if $P_MP_N = P_NP_M$. In that case, $P_MP_N = P_{M \cap N}$.

[4] $P_L$ is part of $P_M$ $\iff$ $P_MP_L = P_L$ $\iff$ $P_LP_M = P_L$ $\iff$ $\|P_Lx\| \leq \|P_Mx\|$ for all $x \in H$.

[5] If $P$ is a projection, then $I - P$ is a projection.

[6] More generally, $P = P_M - P_L$ is a projection if and only if $P_L$ is a part of $P_M$. If so, then $P = P_{M \ominus L}$.

Proof. [1] Let $P_MP_N = 0$ and $x \in M$, $y \in N$. Then

$$(x, y) = (P_Mx, P_Ny) = (P_NP_Mx, y) = 0,$$

so $M \perp N$. Conversely, if $M \perp N$ then for any $x \in H$, $P_Nx \perp M$ so $P_M(P_Nx) = 0$.

[2] If $P = P_M + P_N$ is a projection, then $P^2 = P$, so $P_MP_N + P_NP_M = 0$. Hence

$$P_MP_N + P_MP_NP_M = 0,$$

after multiplying by $P_M$ on the left. Multiplying on the right by $P_M$ then gives $2P_MP_NP_M = 0$ so $P_MP_N = 0$.

Conversely, if $P_MP_N = 0$ then $P_NP_M = 0$ also, so $P^2 = P$. Since $P$ is self-adjoint, it is a projection.

Finally, it is clear that $(P_M + P_N)(H) = M \oplus N$ so $P = P_{M \oplus N}$.

[3] If $P = P_MP_N$ is a projection, then $P^* = P$, so $P_MP_N = (P_MP_N)^* = P_N^*P_M^* = P_NP_M$.

Conversely, let $P_MP_N = P_NP_M = P$. Then $P^* = P$, so $P$ is self-adjoint. Also $P^2 = P_MP_NP_MP_N = P_M^2P_N^2 = P_MP_N = P$, so $P$ is a projection. Moreover, $Px = P_M(P_Nx) = P_N(P_Mx)$ so $Px \in M \cap N$. On the other hand, if $x \in M \cap N$ then $Px = P_M(P_Nx) = P_Mx = x$ so $P = P_{M \cap N}$.

[4] Assume that $P_L$ is part of $P_M$, so $L \subset M$. Then $P_Lx \in M$ for all $x \in H$. Hence $P_MP_Lx = P_Lx$, and $P_MP_L = P_L$.

If $P_MP_L = P_L$, then

$$P_L = P_L^* = (P_MP_L)^* = P_L^*P_M^* = P_LP_M,$$

so $P_LP_M = P_L$.

If $P_LP_M = P_L$, then for any $x \in H$,

$$\|P_Lx\| = \|P_LP_Mx\| \leq \|P_L\|\,\|P_Mx\| \leq \|P_Mx\|,$$

so that $\|P_Lx\| \leq \|P_Mx\|$.

Finally, assume that $\|P_Lx\| \leq \|P_Mx\|$ for all $x \in H$. If there is a point $x_0 \in L \setminus M$ then let $x_0 = y_0 + z_0$, $y_0 \in M$, $z_0 \perp M$, and $z_0 \neq 0$. Then

$$\|P_Lx_0\|^2 = \|y_0\|^2 + \|z_0\|^2 > \|y_0\|^2 = \|P_Mx_0\|^2,$$

so there can be no such point. It follows that $L \subset M$ so $P_L$ is a part of $P_M$.

[5] $I - P$ is self-adjoint, and $(I - P)^2 = I - P - P + P^2 = I - P$.

[6] If $P$ is a projection, then by [5] so is $I - P = (I - P_M) + P_L$. Also by [5], $I - P_M$ is a projection, so by [2] we must have $(I - P_M)P_L = 0$. That is, $P_L = P_MP_L$. Hence, by [4], $P_L$ is a part of $P_M$.

Conversely, if $P_L$ is part of $P_M$, then $P_M - P_L$ and $P_L$ are orthogonal. By [2], the subspace $Y$ of $P_M - P_L$ must therefore satisfy $Y \oplus L = M$, so $Y = M \ominus L$. □
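The algebra of Theorems 5.4 and 5.5 is easy to watch in $\mathbb{R}^3$, where a projection is a symmetric matrix. The following sketch is an illustration only: the subspaces $M$, $N$, $L$ and all helper names are our own choices, and matrices are plain nested lists so that nothing beyond the standard library is assumed.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B, c=1):
    # Entrywise A + c*B.
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def projection(basis):
    # Orthogonal projection onto span(basis); basis must be orthonormal.
    n = len(basis[0])
    return [[sum(v[i] * v[j] for v in basis) for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1 / math.sqrt(2)
u, v, w = [s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]   # orthonormal basis of R^3
P_M = projection([u, w])     # M = span{u, w}
P_N = projection([v, w])     # N = span{v, w}, so M ∩ N = span{w}
P_L = projection([u])        # L = span{u}, a subspace of M with L ⊥ N
P_w = projection([w])
I3 = projection([u, v, w])   # the identity, up to rounding
ZERO = [[0.0] * 3 for _ in range(3)]

print(close(matmul(P_M, P_M), P_M))          # Theorem 5.4: P^2 = P
print(close(matmul(P_M, P_N), P_w))          # Theorem 5.5 [3]: product is P_{M∩N}
print(close(mat_add(P_L, P_N), I3))          # Theorem 5.5 [2]: sum is P_{L⊕N}
print(close(mat_add(P_M, P_L, -1), P_w))     # Theorem 5.5 [6]: difference is P_{M⊖L}
```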



4. Orthonormal sets 



A subset $K$ in a Hilbert space $H$ is orthonormal if each element of $K$ has norm 1, and if any two elements of $K$ are orthogonal. An orthonormal set $K$ is complete if $K^\perp = 0$.

Theorem 5.6. Let $\{x_n\}$ be an orthonormal sequence in $H$. Then for any $x \in H$,

$$\sum_{n=1}^{\infty}|(x, x_n)|^2 \leq \|x\|^2. \tag{48}$$

The inequality (48) is Bessel's inequality. The scalar coefficients $(x, x_n)$ are called the Fourier coefficients of $x$ with respect to $\{x_n\}$.



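Proof. We have

$$\left\|x - \sum_{n=1}^{m}(x, x_n)x_n\right\|^2 = \left(x - \sum_{n=1}^{m}(x, x_n)x_n,\; x - \sum_{n=1}^{m}(x, x_n)x_n\right)$$
$$= \|x\|^2 - \left(\sum_{n=1}^{m}(x, x_n)x_n, x\right) - \left(x, \sum_{n=1}^{m}(x, x_n)x_n\right) + \sum_{n=1}^{m}(x, x_n)(x_n, x),$$

so

$$\left\|x - \sum_{n=1}^{m}(x, x_n)x_n\right\|^2 = \|x\|^2 - \sum_{n=1}^{m}|(x, x_n)|^2. \tag{49}$$

It follows that

$$\sum_{n=1}^{m}|(x, x_n)|^2 \leq \|x\|^2,$$

and Bessel's inequality follows by taking $m \to \infty$. □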



The next result shows that the Fourier series of Theorem 5.6 is the best possible approximation of fixed length.

Theorem 5.7. Let $\{x_n\}$ be an orthonormal sequence in a Hilbert space $H$ and let $\{\lambda_n\}$ be any sequence of scalars. Then, for any $m \geq 1$,

$$\left\|x - \sum_{n=1}^{m}\lambda_nx_n\right\| \geq \left\|x - \sum_{n=1}^{m}(x, x_n)x_n\right\|.$$

Proof. Write $c_n = (x, x_n)$. Then

$$\left\|x - \sum_{n=1}^{m}\lambda_nx_n\right\|^2 = \|x\|^2 - \sum_{n=1}^{m}\bar\lambda_nc_n - \sum_{n=1}^{m}\lambda_n\bar c_n + \sum_{n=1}^{m}|\lambda_n|^2$$
$$= \|x\|^2 - \sum_{n=1}^{m}|c_n|^2 + \sum_{n=1}^{m}|c_n - \lambda_n|^2$$
$$\geq \|x\|^2 - \sum_{n=1}^{m}|c_n|^2.$$

Now apply equation (49). □
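The identity (49), Bessel's inequality and the optimality statement of Theorem 5.7 can all be checked on a toy example. The sketch below uses three orthonormal vectors in $\mathbb{R}^4$ and an arbitrary vector `x`; these choices, and the names `combo` and `residual_sq`, are assumptions made for illustration.

```python
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def combo(coeffs, vectors):
    # The linear combination sum_n coeffs[n] * vectors[n].
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]

def residual_sq(x, coeffs, vectors):
    # Squared distance from x to the given linear combination.
    diff = [a - b for a, b in zip(x, combo(coeffs, vectors))]
    return inner(diff, diff)

basis = [[0.5, 0.5, 0.5, 0.5],
         [0.5, -0.5, 0.5, -0.5],
         [0.5, 0.5, -0.5, -0.5]]          # orthonormal, but not a basis of R^4
x = [3.0, -1.0, 2.0, 5.0]

c = [inner(x, v) for v in basis]           # Fourier coefficients of x
bessel_sum = sum(ci * ci for ci in c)
best = residual_sq(x, c, basis)            # left-hand side of (49)
other = residual_sq(x, [4.0, 1.0, -2.5], basis)   # some other coefficients

print(bessel_sum, inner(x, x))             # Bessel: first value <= second
print(best, inner(x, x) - bessel_sum)      # the two sides of (49)
print(other >= best)                       # Theorem 5.7
```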



Theorem 5.8. Let $\{x_n\}$ be an orthonormal sequence in a Hilbert space $H$, and let $\{\alpha_n\}$ be any sequence of scalars. Then the series $\sum\alpha_nx_n$ is convergent if and only if $\sum|\alpha_n|^2 < \infty$, and if so

$$\left\|\sum_{n=1}^{\infty}\alpha_nx_n\right\| = \left(\sum_{n=1}^{\infty}|\alpha_n|^2\right)^{1/2}. \tag{50}$$

Moreover, the sum $\sum\alpha_nx_n$ is independent of the order in which the terms are arranged.



Proof. For $m > n$ we have (by orthonormality)

$$\left\|\sum_{j=n}^{m}\alpha_jx_j\right\|^2 = \sum_{j=n}^{m}|\alpha_j|^2. \tag{51}$$

Since $H$ is complete, (51) shows the first part of the theorem. Take $n = 1$ and $m \to \infty$ in (51) to get (50).

Assume that $\sum|\alpha_j|^2 < \infty$ and let $z = \sum\alpha_{j_n}x_{j_n}$ be a rearrangement of the series $x = \sum\alpha_jx_j$. Then

$$\|x - z\|^2 = (x, x) + (z, z) - (x, z) - (z, x), \tag{52}$$

and $(x, x) = (z, z) = \sum|\alpha_j|^2$. Write

$$s_m = \sum_{j=1}^{m}\alpha_jx_j, \quad t_m = \sum_{n=1}^{m}\alpha_{j_n}x_{j_n}.$$

Then

$$(x, z) = \lim_m(s_m, t_m) = \sum_{j=1}^{\infty}|\alpha_j|^2.$$

Also, $(z, x) = \overline{(x, z)} = (x, z)$ so (52) shows that $\|x - z\|^2 = 0$ and hence $x = z$. □

Theorem 5.9. Let $K$ be any orthonormal set in a Hilbert space $H$, and for each $x \in H$ let $K_x = \{y \mid y \in K, (x, y) \neq 0\}$. Then:

(i) for any $x \in H$, $K_x$ is countable;

(ii) the sum $Ex = \sum_{y \in K_x}(x, y)y$ converges independently of the order in which the terms are arranged;

(iii) $E$ is the projection operator onto the closed linear space spanned by $K$.

Proof. From Bessel's inequality (48), for any $\epsilon > 0$ there are no more than $\|x\|^2/\epsilon^2$ points $y$ in $K$ with $|(x, y)| > \epsilon$. Taking $\epsilon = 1, \frac12, \frac13, \ldots$ we see that $K_x$ is countable for any $x$.

Bessel's inequality and Theorem 5.8 show (ii).

Let $\langle K\rangle$ denote the closed linear subspace spanned by $K$. If $x \perp \langle K\rangle$ then $Ex = 0$. If $x \in \langle K\rangle$ then for any $\epsilon > 0$ there are scalars $\lambda_1, \ldots, \lambda_n$ and elements $y_1, \ldots, y_n \in K$ such that

$$\left\|x - \sum_{j=1}^{n}\lambda_jy_j\right\| < \epsilon.$$

Then, by Theorem 5.7,

$$\left\|x - \sum_{j=1}^{n}(x, y_j)y_j\right\| < \epsilon. \tag{53}$$

Without loss of generality, all of the $y_j$ lie in $K_x$. Arrange the set $K_x$ in a sequence $\{y_j\}$. From (49) notice that the left-hand side of (53) does not increase with $n$. Taking $n \to \infty$, we get $\|x - Ex\| \leq \epsilon$. Since $\epsilon > 0$ is arbitrary, we deduce that $Ex = x$ for all $x \in \langle K\rangle$. This proves that $E = P_{\langle K\rangle}$. □



Definition. A set $K$ is an orthonormal basis of $H$ if $K$ is orthonormal and for every $x \in H$,

$$x = \sum_{y \in K_x}(x, y)y. \tag{54}$$



Theorem 5.10. Let $K$ be an orthonormal set in a Hilbert space $H$. Then the following properties are equivalent.

(i) $K$ is complete;

(ii) $\langle K\rangle = H$;

(iii) $K$ is an orthonormal basis for $H$;

(iv) for any $x \in H$, $\|x\|^2 = \sum_{y \in K_x}|(x, y)|^2$.

The equality in (iv) is called Parseval's formula.

Proof. That (i) implies (ii) follows from Corollary 5.1. Assume (ii). Then by Theorem 5.9, $Ex = x$ for all $x \in H$, so $K$ is an orthonormal basis. Now assume (iii). Arrange the elements of $K_x$ in a sequence $\{x_n\}$, and take $n \to \infty$ in (49) to obtain Parseval's formula (iv). Finally, assume (iv). If $x \perp K$, then $\|x\|^2 = \sum|(x, y)|^2 = 0$, so $x = 0$. This means that (iv) implies (i). □

Theorem 5.11. Every Hilbert space has an orthonormal basis. Any orthonormal basis in a separable Hilbert space is countable.

Example 5.2. Classical Fourier analysis comes about using the orthonormal basis $\{e^{2\pi int}\}_{n \in \mathbb{Z}}$ for $L_2[0,1]$.

Proof. Let $H$ be a Hilbert space, and consider the classes of orthonormal sets in $H$ with the partial order of inclusion. By Lemma A.1 there exists a maximal orthonormal set $K$. Since $K$ is maximal, it is complete and is therefore an orthonormal basis.

Now let $H$ be separable, and suppose that $\{x_\alpha\}$ is an uncountable orthonormal basis. Since, for any $\alpha \neq \beta$,

$$\|x_\alpha - x_\beta\|^2 = \|x_\alpha\|^2 + \|x_\beta\|^2 = 2,$$

the balls $B_{1/2}(x_\alpha)$ are mutually disjoint. If $\{y_n\}$ is a dense sequence in $H$, then there is a ball $B_{1/2}(x_{\alpha_0})$ that does not contain any of the points $y_n$. Hence $x_{\alpha_0}$ is not in the closure of $\{y_n\}$, a contradiction. □

Corollary 5.3. Any two infinite-dimensional separable Hilbert spaces are isometrically isomorphic.

Proof. Let $H_1$ and $H_2$ be two such spaces. By Theorem 5.11 there are sequences $\{x_n\}$ and $\{y_n\}$ that form orthonormal bases for $H_1$ and $H_2$ respectively. Given any points $x \in H_1$ and $y \in H_2$, we may write

$$x = \sum_{n=1}^{\infty}c_nx_n, \quad y = \sum_{n=1}^{\infty}d_ny_n, \tag{55}$$

where $c_n = (x, x_n)$ and $d_n = (y, y_n)$ for all $n \geq 1$. Define a map $T : H_1 \to H_2$ by $Tx = y$ if $c_n = d_n$ for all $n$ in (55). It is clear that $T$ is linear and it maps $H_1$ onto $H_2$ since the sequences $(c_n)$ and $(d_n)$ run through all of $\ell_2$. Also,

$$\|Tx\|^2 = \sum_{n=1}^{\infty}|d_n|^2 = \sum_{n=1}^{\infty}|c_n|^2 = \|x\|^2,$$

so $T$ is an isometry. □

5. Gram Schmidt orthonormalization 

Starting with any linearly independent set $\{x_1, x_2, \ldots\}$ in a Hilbert space $H$, we can inductively construct an orthonormal set that spans the same subspace by the Gram-Schmidt Orthonormalization process (Theorem 5.12). The idea is simple: first, any vector $v$ can be reduced to unit length simply by dividing by the length $\|v\|$. Second, if $x_1$ is a fixed unit vector and $x_2$ is another unit vector with $\{x_1, x_2\}$ linearly independent, then $x_2 - (x_2, x_1)x_1$ is a non-zero vector (since $x_1$ and $x_2$ are independent), is orthogonal to $x_1$ (since $(x_1, x_2 - (x_2, x_1)x_1) = (x_1, x_2) - \overline{(x_2, x_1)}(x_1, x_1) = (x_1, x_2) - (x_1, x_2) = 0$), and $\{x_1, x_2 - (x_2, x_1)x_1\}$ spans the same space as $\{x_1, x_2\}$. This idea can be extended as follows; the notational complexity comes about because of the need to renormalize (make the new vector unit length).

We will only need this for sets whose linear span is dense.

Theorem 5.12. If $\{x_1, x_2, \ldots\}$ is a linearly independent set whose linear span is dense in $H$, then the set $\{\phi_1, \phi_2, \ldots\}$ defined below is an orthonormal basis for $H$:

$$\phi_1 = \frac{x_1}{\|x_1\|},$$

$$\phi_2 = \frac{x_2 - (x_2, \phi_1)\phi_1}{\|x_2 - (x_2, \phi_1)\phi_1\|},$$

and in general for any $n \geq 1$,

$$\phi_n = \frac{x_n - (x_n, \phi_1)\phi_1 - (x_n, \phi_2)\phi_2 - \cdots - (x_n, \phi_{n-1})\phi_{n-1}}{\|x_n - (x_n, \phi_1)\phi_1 - (x_n, \phi_2)\phi_2 - \cdots - (x_n, \phi_{n-1})\phi_{n-1}\|}.$$

The proof is obvious unless you try to write it down: the idea is that at each stage the piece of the next vector $x_n$ that is not orthogonal to the space spanned by $\{x_1, \ldots, x_{n-1}\}$ is subtracted. The vector $\phi_n$ so constructed cannot be zero by linear independence.
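The recursion of Theorem 5.12 translates directly into a few lines of code. The following is a minimal sketch for finitely many vectors in $\mathbb{C}^n$ with the standard inner product; the function name `gram_schmidt` and the sample vectors are ours, and the input is assumed to be linearly independent (otherwise the division by the length fails).

```python
import math

def inner(x, y):
    # Linear in x, conjugate-linear in y.
    return sum(a * b.conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors):
    # Returns phi_1, ..., phi_n as in Theorem 5.12.
    phis = []
    for x in vectors:
        r = list(x)
        for phi in phis:
            c = inner(x, phi)                       # the coefficient (x_n, phi_k)
            r = [ri - c * pi for ri, pi in zip(r, phi)]
        length = math.sqrt(inner(r, r).real)        # nonzero by independence
        phis.append([ri / length for ri in r])
    return phis

xs = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]             # linearly independent in C^3
phis = gram_schmidt(xs)

# The Gram matrix of the output should be the identity.
gram = [[inner(p, q) for q in phis] for p in phis]
for row in gram:
    print([round(abs(entry), 12) for entry in row])
```

In floating point the subtraction step loses accuracy when the input vectors are nearly dependent; for serious numerical work a QR factorization routine is the usual choice.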

The most important situation in which this is used is to find orthonormal bases for certain weighted function spaces.

Given $a < b$, $a, b \in [-\infty, \infty]$ and a function $M : (a, b) \to (0, \infty)$ with the property that $\int_a^b t^nM(t)\,dt < \infty$ for all $n \geq 0$, define the Hilbert space $L_2^M[a, b]$ to be the linear space of measurable functions $f$ with $\|f\|_M = (f, f)_M^{1/2} < \infty$ where

$$(f, g)_M = \int_a^b M(t)f(t)\overline{g(t)}\,dt.$$

It may be shown that the linearly independent set $\{1, t, t^2, t^3, \ldots\}$ has a linear span dense in $L_2^M$. The Gram-Schmidt orthonormalization process may be applied to this set to produce various families of classical orthonormal functions.

Example 5.3. [1] If $M(t) = 1$ for all $t$, $a = -1$, $b = 1$, then the process generates the Legendre polynomials.

[2] If $M(t) = \frac{1}{\sqrt{1 - t^2}}$, $a = -1$, $b = 1$, then the process generates the Tchebychev polynomials.

[3] If $M(t) = t^{q-1}(1 - t)^{p-q}$, $a = 0$, $b = 1$ (with $q > 0$ and $p - q > -1$), then the process generates the Jacobi polynomials.

[4] If $M(t) = e^{-t^2}$, $a = -\infty$, $b = \infty$, then the process generates the Hermite polynomials.

[5] If $M(t) = e^{-t}$, $a = 0$, $b = \infty$, then the process generates the Laguerre polynomials.
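Case [1] can be carried out exactly with rational arithmetic. In the sketch below a polynomial is a list of coefficients (lowest degree first) and the inner product uses $\int_{-1}^{1}t^k\,dt = 2/(k+1)$ for even $k$ and $0$ for odd $k$. One deliberate departure from Theorem 5.12, made to avoid square roots: the polynomials are only orthogonalized, then rescaled so that $p(1) = 1$, which is the classical normalization of the Legendre polynomials rather than unit norm. All names are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    # (p, q)_M with M(t) = 1 on [-1, 1], computed term by term.
    return sum(c * Fraction(2, k + 1)
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def orthogonalize(degree):
    polys = []
    for n in range(degree + 1):
        p = [Fraction(0)] * n + [Fraction(1)]            # the monomial t^n
        for q in polys:
            c = inner(p, q) / inner(q, q)
            q_padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - c * b for a, b in zip(p, q_padded)]
        polys.append(p)
    # Rescale so that p(1) = 1; sum(p) is the value of p at t = 1.
    return [[a / sum(p) for a in p] for p in polys]

legendre = orthogonalize(3)
for p in legendre:
    print([str(a) for a in p])
# The last two rows are (3t^2 - 1)/2 and (5t^3 - 3t)/2.
```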



CHAPTER 6 



Fourier analysis 



In the last chapter we saw some very general methods of "Fourier analysis" in Hilbert space. Of course the methods started with the classical setting of periodic complex-valued functions on the real line, and in this chapter we describe the elementary theory of classical Fourier analysis using summability kernels. The classical theory of Fourier series is a huge subject: the introduction below comes mostly from Katznelson¹ and from Körner²; both are highly recommended for further study.



1. Fourier series of $L_1$ functions

Denote by $L_1(\mathbb{T})$ the Banach space of complex-valued, Lebesgue integrable functions on $\mathbb{T} = [0, 2\pi)/0 \sim 2\pi$ (this just means periodic functions). Modify the $L_1$-norm on this space so that

$$\|f\|_1 = \frac{1}{2\pi}\int_0^{2\pi}|f(t)|\,dt.$$

What is going on here is simply this: to avoid writing "$2\pi$" hundreds of times, we make the unit circle have "length" $2\pi$. To recover the useful normalization that the $L_1$-norm of the constant function 1 is 1, the usual $L_1$-norm is divided by $2\pi$.

Notice that the translate $f_x$ of a function has the same norm, where $f_x(t) = f(t - x)$.

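Definition 6.1. A trigonometric polynomial on $\mathbb{T}$ is an expression of the form

$$P(t) = \sum_{n=-N}^{N}a_ne^{int},$$

with $a_n \in \mathbb{C}$.

Lemma 6.1. The functions $\{e^{int}\}_{n \in \mathbb{Z}}$ are pairwise orthogonal in $L_2$. That is,

$$\frac{1}{2\pi}\int e^{int}e^{-imt}\,dt = \frac{1}{2\pi}\int_0^{2\pi}e^{i(n-m)t}\,dt = \begin{cases}1 & \text{if } n = m,\\ 0 & \text{if } n \neq m.\end{cases}$$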

It follows that if the function $P(t)$ is given, we can recover the coefficients $a_n$ by computing

$$a_n = \frac{1}{2\pi}\int P(t)e^{-int}\,dt.$$

¹ An Introduction to Harmonic Analysis, Y. Katznelson, Dover Publications, New York (1976).

² Fourier Analysis, T. Körner, Cambridge University Press, Cambridge.

It will be useful later to write things like

$$P \sim \sum_{n=-N}^{N}a_ne^{int},$$

which means that $P$ is identified with the formal sum on the right hand side. The expression $P(t) = \ldots$ is a function defined by the value of the right hand side for each value of $t$.

Definition 6.2. A trigonometric series on $\mathbb{T}$ is an expression

$$S \sim \sum_{n=-\infty}^{\infty}a_ne^{int}. \tag{56}$$

The conjugate of $S$ is the series

$$\tilde S \sim \sum_{n=-\infty}^{\infty}-i\,\mathrm{sign}(n)a_ne^{int} \tag{57}$$

where $\mathrm{sign}(n) = 0$ if $n = 0$ and $= n/|n|$ if not.

Notice that there is no assumption about convergence, so in general $S$ is not related to a function at all.

Definition 6.3. Let $f \in L_1(\mathbb{T})$. Define the $n$th (classical) Fourier coefficient of $f$ to be

$$\hat f(n) = \frac{1}{2\pi}\int f(t)e^{-int}\,dt \tag{58}$$

(the integration is from $0$ to $2\pi$ as usual). Associate to $f$ the Fourier series $S[f]$, which is defined to be the formal trigonometric series

$$S[f] \sim \sum_{n=-\infty}^{\infty}\hat f(n)e^{int}. \tag{59}$$

We say that a given trigonometric series (56) is a Fourier series if it is of the form (59) for some $f \in L_1(\mathbb{T})$.

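Theorem 6.1. Let $f, g \in L_1(\mathbb{T})$. Then

[1] $\widehat{(f + g)}(n) = \hat f(n) + \hat g(n)$.

[2] For $\lambda \in \mathbb{C}$, $\widehat{(\lambda f)}(n) = \lambda\hat f(n)$.

[3] If $\bar f(t) = \overline{f(t)}$ is the complex conjugate of $f$ then $\hat{\bar f}(n) = \overline{\hat f(-n)}$.

[4] If $f_x(t) = f(t - x)$ is the translate of $f$, then $\hat{f_x}(n) = e^{-inx}\hat f(n)$.

[5] $|\hat f(n)| \leq \frac{1}{2\pi}\int|f(t)|\,dt = \|f\|_1$.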

Prove these as an exercise. 
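Before proving these, it can help to see them numerically. The sketch below approximates (58) by a Riemann sum on a uniform grid, which is exact (up to rounding) for trigonometric polynomials of low degree. The test function `f`, the shift `x` and the helper names are assumptions chosen for illustration; properties [4] and [5] are then checked directly.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=64):
    # Riemann sum for (1/2π) ∫_0^{2π} f(t) e^{-int} dt on a uniform grid.
    ts = [2 * math.pi * k / samples for k in range(samples)]
    return sum(f(t) * cmath.exp(-1j * n * t) for t in ts) / samples

def l1_norm(f, samples=64):
    # Riemann sum for the normalized L_1 norm (1/2π) ∫ |f(t)| dt.
    ts = [2 * math.pi * k / samples for k in range(samples)]
    return sum(abs(f(t)) for t in ts) / samples

# A trigonometric polynomial with coefficients 2 (n=0), 3 (n=1), -(1+i) (n=-2).
f = lambda t: 2 + 3 * cmath.exp(1j * t) - (1 + 1j) * cmath.exp(-2j * t)

x = 0.4
f_x = lambda t: f(t - x)                     # the translate of f

print(fourier_coefficient(f, 1), fourier_coefficient(f, -2))
print(fourier_coefficient(f_x, 1), cmath.exp(-1j * x) * fourier_coefficient(f, 1))
print(max(abs(fourier_coefficient(f, n)) for n in range(-4, 5)), l1_norm(f))
```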

Notice that $f \mapsto \hat f$ sends a function in $L_1(\mathbb{T})$ to a function in $C(\mathbb{Z})$, the continuous functions on $\mathbb{Z}$ with the sup norm. This map is continuous in the following sense.

Corollary 6.1. Assume $(f_j)$ is a sequence in $L_1(\mathbb{T})$ with $\|f_j - f\|_1 \to 0$. Then $\hat f_j \to \hat f$ uniformly.

Proof. This follows at once from Theorem 6.1 [5]. □



Theorem 6.2. Let $f \in L_1(\mathbb{T})$ have $\hat f(0) = 0$. Define

$$F(t) = \int_0^t f(s)\,ds.$$

Then $F$ is continuous, $2\pi$ periodic, and

$$\hat F(n) = \frac{1}{in}\hat f(n)$$

for all $n \neq 0$.

Proof. It is clear that $F$ is continuous since it is the integral of an $L_1$ function. Also,

$$F(t + 2\pi) - F(t) = \int_t^{t+2\pi}f(s)\,ds = 2\pi\hat f(0) = 0.$$

Finally, using integration by parts (the boundary term vanishes by periodicity),

$$\hat F(n) = \frac{1}{2\pi}\int_0^{2\pi}F(t)e^{-int}\,dt = -\frac{1}{2\pi}\int_0^{2\pi}F'(t)\frac{-1}{in}e^{-int}\,dt = \frac{1}{in}\hat f(n).$$

□

Notice that we have used the symbol $F'$: the function $F$ is differentiable almost everywhere, with $F' = f$, because of the way it was defined.

2. Convolution in $L_1$

In this section we introduce a form of "multiplication" on $L_1(\mathbb{T})$ that makes it into a Banach algebra (see Definition 3.3). Notice that the only real property we will use is that the circle $\mathbb{T}$ is a group on which the measure $ds$ is translation invariant:

$$\int f_x(s)\,ds = \int f(s)\,ds.$$

Theorem 6.3. Assume that $f, g$ are in $L_1(\mathbb{T})$. Then, for almost every $t$, the function $f(t - s)g(s)$ is integrable as a function of $s$. Define the convolution of $f$ and $g$ to be

$$(f * g)(t) = \frac{1}{2\pi}\int f(t - s)g(s)\,ds. \tag{60}$$

Then $f * g \in L_1(\mathbb{T})$, with norm

$$\|f * g\|_1 \leq \|f\|_1\|g\|_1.$$

Moreover

$$\widehat{(f * g)}(n) = \hat f(n)\hat g(n).$$

Proof. It is clear that $F(t, s) = f(t - s)g(s)$ is a measurable function of the variable $(s, t)$. For almost all $s$, $F(t, s)$ is a constant multiple of $f_s$, so is integrable. Moreover

$$\frac{1}{2\pi}\int\left(\frac{1}{2\pi}\int|F(t, s)|\,dt\right)ds = \frac{1}{2\pi}\int|g(s)|\,\|f\|_1\,ds = \|f\|_1\|g\|_1.$$

So, by Fubini's Theorem 4.4, $f(t - s)g(s)$ is integrable as a function of $s$ for almost all $t$, and

$$\frac{1}{2\pi}\int|(f * g)(t)|\,dt = \frac{1}{2\pi}\int\left|\frac{1}{2\pi}\int F(t, s)\,ds\right|dt \leq \frac{1}{4\pi^2}\iint|F(t, s)|\,dt\,ds = \|f\|_1\|g\|_1,$$

showing that $\|f * g\|_1 \leq \|f\|_1\|g\|_1$. Finally, using Fubini again to justify a change in the order of integration,

$$\widehat{(f * g)}(n) = \frac{1}{2\pi}\int(f * g)(t)e^{-int}\,dt = \frac{1}{4\pi^2}\iint f(t - s)e^{-in(t-s)}e^{-ins}g(s)\,dt\,ds = \hat f(n)\hat g(n).$$

□
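The multiplicative property can be watched on trigonometric polynomials, where Riemann sums on a uniform grid reproduce the integrals in (58) and (60) up to rounding. In the sketch below `f`, `g`, the grid size and the helper names are illustrative assumptions; since $\hat g$ is supported on $\{-1, 0, 1\}$, the convolution should keep only the matching coefficients of $f$, namely $f * g = 2 + 3e^{it}$.

```python
import cmath
import math

N = 32
grid = [2 * math.pi * k / N for k in range(N)]

def coeff(h, n):
    # Riemann sum for the nth Fourier coefficient (58).
    return sum(h(t) * cmath.exp(-1j * n * t) for t in grid) / N

def convolve(f, g):
    # Riemann sum for (f * g)(t) = (1/2π) ∫ f(t - s) g(s) ds, as in (60).
    return lambda t: sum(f(t - s) * g(s) for s in grid) / N

f = lambda t: 2 + 3 * cmath.exp(1j * t) - (1 + 1j) * cmath.exp(-2j * t)
g = lambda t: 1 + 2 * math.cos(t)            # coefficients 1 at n = -1, 0, 1
fg = convolve(f, g)

for n in range(-3, 4):
    print(n, coeff(fg, n), coeff(f, n) * coeff(g, n))

t0 = 0.7
print(fg(t0), 2 + 3 * cmath.exp(1j * t0))    # the two values agree
```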

Lemma 6.2. The operation $(f, g) \mapsto f * g$ is commutative, associative, and distributive over addition.

Prove this as an exercise.

Lemma 6.3. If $f \in L_1(\mathbb{T})$ and $k(t) = \sum_{n=-N}^{N}a_ne^{int}$ then

$$(k * f)(t) = \sum_{n=-N}^{N}a_n\hat f(n)e^{int}.$$

Thus convolving with the function $e^{int}$ picks out the $n$th Fourier coefficient.

Proof. Simply check this one term at a time: if $\chi_n(t) = e^{int}$, then

$$(\chi_n * f)(t) = \frac{1}{2\pi}\int e^{in(t-s)}f(s)\,ds = e^{int}\frac{1}{2\pi}\int f(s)e^{-ins}\,ds.$$

□

3. Summability kernels and homogeneous Banach algebras 

Two properties of the Banach space Li(T) are particularly important for Fourier 
analysis. 

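Theorem 6.4. If $f \in L_1(\mathbb{T})$ and $x \in \mathbb{T}$, then

$$f_x(t) = f(t - x) \in L_1(\mathbb{T}) \text{ and } \|f_x\|_1 = \|f\|_1.$$

Also, the function $x \mapsto f_x$ is continuous on $\mathbb{T}$ for each $f \in L_1(\mathbb{T})$.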

Proof. The translation invariance is clear.

In order to prove the continuity we must show that

$$\lim_{x \to x_0}\|f_x - f_{x_0}\|_1 = 0. \tag{61}$$

Now (61) is clear if $f$ is continuous. On the other hand, the continuous functions are dense in $L_1(\mathbb{T})$, so given $f \in L_1(\mathbb{T})$ and $\epsilon > 0$ we may choose $g \in C(\mathbb{T})$ such that

$$\|g - f\|_1 < \epsilon.$$

Then

$$\|f_x - f_{x_0}\|_1 \leq \|f_x - g_x\|_1 + \|g_x - g_{x_0}\|_1 + \|g_{x_0} - f_{x_0}\|_1$$
$$= \|(f - g)_x\|_1 + \|g_x - g_{x_0}\|_1 + \|(g - f)_{x_0}\|_1$$
$$< 2\epsilon + \|g_x - g_{x_0}\|_1.$$

It follows that

$$\limsup_{x \to x_0}\|f_x - f_{x_0}\|_1 \leq 2\epsilon,$$

so the theorem is proved. □

Definition 6.4. A summability kernel is a sequence $(k_n)$ of continuous $2\pi$-periodic functions with the following properties:

$$\frac{1}{2\pi}\int k_n(t)\,dt = 1 \text{ for all } n. \tag{62}$$

$$\text{There is an } R \text{ such that } \frac{1}{2\pi}\int|k_n(t)|\,dt \leq R \text{ for all } n. \tag{63}$$

$$\text{For all } \delta > 0, \quad \lim_{n \to \infty}\int_\delta^{2\pi-\delta}|k_n(t)|\,dt = 0. \tag{64}$$

If in addition $k_n(t) \geq 0$ for all $n$ and $t$ then $(k_n)$ is called a positive summability kernel.

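Theorem 6.5. Let $f \in L_1(\mathbb{T})$ and let $(k_n)$ be a summability kernel. Then

$$f = \lim_{n \to \infty}\frac{1}{2\pi}\int k_n(s)f_s\,ds$$

in the $L_1$ norm.

Proof. Write $\phi(s) = f_s$, so that $\phi(s)(t) = f(t - s)$. By Theorem 6.4 $\phi$ is a continuous $L_1(\mathbb{T})$-valued function on $\mathbb{T}$, and $\phi(0) = f$. We will be integrating $L_1(\mathbb{T})$-valued functions; see the Appendix for a brief definition of what this means.

Then for any $0 < \delta < \pi$, by (62) we have

$$\frac{1}{2\pi}\int k_n(s)\phi(s)\,ds - \phi(0) = \frac{1}{2\pi}\int k_n(s)(\phi(s) - \phi(0))\,ds$$
$$= \frac{1}{2\pi}\int_{-\delta}^{\delta}k_n(s)(\phi(s) - \phi(0))\,ds + \frac{1}{2\pi}\int_\delta^{2\pi-\delta}k_n(s)(\phi(s) - \phi(0))\,ds.$$

The two parts may be estimated separately:

$$\left\|\frac{1}{2\pi}\int_{-\delta}^{\delta}k_n(s)(\phi(s) - \phi(0))\,ds\right\|_1 \leq \max_{|s| \leq \delta}\|\phi(s) - \phi(0)\|_1\|k_n\|_1, \tag{65}$$

and

$$\left\|\frac{1}{2\pi}\int_\delta^{2\pi-\delta}k_n(s)(\phi(s) - \phi(0))\,ds\right\|_1 \leq \max_s\|\phi(s) - \phi(0)\|_1\,\frac{1}{2\pi}\int_\delta^{2\pi-\delta}|k_n(s)|\,ds. \tag{66}$$

Using (63) and the fact that $\phi$ is continuous at $s = 0$, given any $\epsilon > 0$ there is a $\delta > 0$ such that (65) is bounded by $\epsilon$. With the same $\delta$, (64) implies that (66) converges to $0$ as $n \to \infty$, so that $\frac{1}{2\pi}\int k_n(s)\phi(s)\,ds - \phi(0)$ is bounded in norm by $2\epsilon$ for large $n$. □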

The integral appearing in Theorem 6.5 looks a bit like a convolution of $L_1(\mathbb{T})$-valued functions. This is not a problem for us. Consider first the following lemma.

Lemma 6.4. Let $k$ be a continuous function on $\mathbb{T}$, and $f \in L_1(\mathbb{T})$. Then

$$\frac{1}{2\pi}\int k(s)f_s\,ds = k * f. \tag{67}$$

Proof. Assume first that $f$ is continuous on $\mathbb{T}$. Then, making the obvious definition for the integral,

$$\frac{1}{2\pi}\int k(s)f_s\,ds = \frac{1}{2\pi}\lim\sum_j(s_{j+1} - s_j)k(s_j)f_{s_j},$$

with the limit taken in the $L_1(\mathbb{T})$ norm as the partition of $\mathbb{T}$ defined by $\{s_1, \ldots, s_j, \ldots\}$ becomes finer. On the other hand,

$$\frac{1}{2\pi}\lim\sum_j(s_{j+1} - s_j)k(s_j)f(t - s_j) = (k * f)(t)$$

uniformly, proving the lemma for continuous $f$.

For arbitrary $f \in L_1(\mathbb{T})$, fix $\epsilon > 0$ and choose a continuous function $g$ with $\|f - g\|_1 < \epsilon$. Then

$$\frac{1}{2\pi}\int k(s)f_s\,ds - k * f = \frac{1}{2\pi}\int k(s)(f - g)_s\,ds + k * (g - f),$$

so

$$\left\|\frac{1}{2\pi}\int k(s)f_s\,ds - k * f\right\|_1 \leq 2\|k\|_1\epsilon.$$

□

Lemma 6.4 means that Theorem 6.5 can be written in the form

$$f = \lim_{n \to \infty}k_n * f \text{ in } L_1. \tag{68}$$



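4. Fejér's kernel

Define a sequence of functions

$$K_n(t) = \sum_{j=-n}^{n}\left(1 - \frac{|j|}{n+1}\right)e^{ijt}.$$

Lemma 6.5. The sequence $(K_n)$ is a summability kernel.

Proof. Property (62) is clear.

Now notice that

$$\left(-\tfrac14e^{-it} + \tfrac12 - \tfrac14e^{it}\right)\sum_{j=-n}^{n}\left(1 - \frac{|j|}{n+1}\right)e^{ijt} = \frac{1}{n+1}\left(-\tfrac14e^{-i(n+1)t} + \tfrac12 - \tfrac14e^{i(n+1)t}\right).$$

On the other hand,

$$\sin^2\frac{t}{2} = \tfrac12(1 - \cos t) = -\tfrac14e^{-it} + \tfrac12 - \tfrac14e^{it},$$

so

$$K_n(t) = \frac{1}{n+1}\left(\frac{\sin\frac{n+1}{2}t}{\sin\frac12t}\right)^2. \tag{69}$$

Property (64) follows, and this also shows that $K_n(t) \geq 0$ for all $n$ and $t$. Prove property (63) as an exercise. □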
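The closed form (69) and the kernel properties can be checked numerically. The following sketch compares the defining sum with (69), confirms that the normalized mean is 1 as in (62), and shows the mass away from $t = 0$ shrinking as in (64). The function names, the choice $\delta = 0.5$ and the sample sizes are assumptions; the integrals are midpoint-rule approximations, not exact values.

```python
import math

def fejer_sum(n, t):
    # Defining sum of K_n; the imaginary parts cancel, leaving cosines.
    return sum((1 - abs(j) / (n + 1)) * math.cos(j * t) for j in range(-n, n + 1))

def fejer_closed(n, t):
    # Closed form (69); valid when t is not a multiple of 2π.
    return (math.sin((n + 1) * t / 2) / math.sin(t / 2)) ** 2 / (n + 1)

def mean(n, samples=4000):
    # Midpoint rule for (1/2π) ∫_0^{2π} K_n(t) dt, property (62).
    return sum(fejer_closed(n, 2 * math.pi * (k + 0.5) / samples)
               for k in range(samples)) / samples

def tail(n, delta, samples=4000):
    # Midpoint rule for ∫_δ^{2π-δ} K_n(t) dt, the quantity in (64).
    width = (2 * math.pi - 2 * delta) / samples
    return sum(fejer_closed(n, delta + (k + 0.5) * width)
               for k in range(samples)) * width

print(fejer_sum(7, 1.3), fejer_closed(7, 1.3))   # the two forms agree
print(mean(10))                                   # close to 1
print([tail(n, 0.5) for n in (5, 50, 500)])       # decreasing towards 0
```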



[Figure: graph of the Fejér kernel $K_{11}$.]

Definition 6.5. Write $\sigma_n(f) = K_n * f$.

Using Lemma 6.3, it follows that

$$\sigma_n(f)(t) = \sum_{j=-n}^{n}\left(1 - \frac{|j|}{n+1}\right)\hat f(j)e^{ijt}, \tag{70}$$

and (68) means that

$$\sigma_n(f) \to f$$

in the $L_1$ norm for every $f \in L_1(\mathbb{T})$. It follows at once that the trigonometric polynomials are dense in $L_1(\mathbb{T})$. The most important consequences are however more general statements about Fourier series.

Theorem 6.6. If $f, g \in L_1(\mathbb{T})$ have $\hat f(n) = \hat g(n)$ for all $n \in \mathbb{Z}$, then $f = g$.

Proof. It is enough to show that $\hat f(n) = 0$ for all $n$ implies that $f = 0$. Using (70), we see that if $\hat f(n) = 0$ for all $n$, then $\sigma_n(f) = 0$ for all $n$; since $\sigma_n(f) \to f$, it follows that $f = 0$. □

Corollary 6.2. The family of functions $\{e^{int}\}_{n \in \mathbb{Z}}$ forms a complete orthonormal system in $L_2(\mathbb{T})$.

Proof. It is enough to notice that

$$(f, e^{int}) = \hat f(n).$$

Then for all $f \in L_2(\mathbb{T})$, the function $f$ and its Fourier series have identical Fourier coefficients, so must agree. □

We also find a very general statement about the decay of Fourier coefficients: the Riemann-Lebesgue Lemma.

Theorem 6.7. Let $f \in L_1(\mathbb{T})$. Then $\lim_{|n| \to \infty}\hat f(n) = 0$.

Proof. Fix an $\epsilon > 0$, and choose a trigonometric polynomial $P$ with the property that

$$\|f - P\|_1 < \epsilon.$$

If $|n|$ exceeds the degree of $P$, then

$$|\hat f(n)| = |\widehat{(f - P)}(n)| \leq \|f - P\|_1 < \epsilon.$$

□

Recall that for $f \in L_1(\mathbb{T})$, the Fourier series was defined (formally) to be

$$S[f] \sim \sum_{n=-\infty}^{\infty}\hat f(n)e^{int},$$

and the $n$th partial sum corresponds to the function

$$S_n(f)(t) = \sum_{j=-n}^{n}\hat f(j)e^{ijt}. \tag{71}$$

Looking at equations (71) and (70), we see that $\sigma_n(f)$ is the arithmetic mean of $S_0(f), S_1(f), \ldots, S_n(f)$:

$$\sigma_n(f) = \frac{1}{n+1}\left(S_0(f) + S_1(f) + \cdots + S_n(f)\right). \tag{72}$$

It follows that if $S_n(f)$ converges in $L_1(\mathbb{T})$, then it must converge to the same thing as $\sigma_n$, that is to $f$ (if this is not clear to you, look at Corollary 6.3 below).

The partial sums $S_n(f)$ also have a convolution form: using (70) we have that $S_n(f) = D_n * f$ where $(D_n)$ is the Dirichlet kernel defined by

$$D_n(t) = \sum_{j=-n}^{n}e^{ijt} = \frac{\sin(n + \frac12)t}{\sin\frac12t}.$$

Notice that $(D_n)$ is not a summability kernel: it has property (62) but does not have (63) (as we saw in Lemma 3.3) nor does it have (64). This explains why the question of convergence for Fourier series is so much more subtle than the problem of summability.

[Figure: graph of the Dirichlet kernel $D_{11}$.]
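The failure of (63) for the Dirichlet kernel is visible numerically: the normalized $L_1$ norms of $D_n$ grow slowly but without bound as $n$ increases. The sketch below estimates them with a midpoint rule; the function names and the sample size are assumptions, and the printed norms are approximations.

```python
import math

def dirichlet_sum(n, t):
    # Defining sum of D_n; the imaginary parts cancel, leaving cosines.
    return sum(math.cos(j * t) for j in range(-n, n + 1))

def dirichlet(n, t):
    # Closed form; valid when t is not a multiple of 2π.
    return math.sin((n + 0.5) * t) / math.sin(t / 2)

def l1_norm(n, samples=20000):
    # Midpoint rule for (1/2π) ∫_0^{2π} |D_n(t)| dt.
    return sum(abs(dirichlet(n, 2 * math.pi * (k + 0.5) / samples))
               for k in range(samples)) / samples

print(dirichlet_sum(6, 1.0), dirichlet(6, 1.0))   # the two forms agree
norms = {n: l1_norm(n) for n in (4, 16, 64)}
print(norms)                                       # increasing with n
```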



Definition 6.6. The de la Vallée Poussin kernel is defined by

$$V_n(t) = 2K_{2n+1}(t) - K_n(t).$$

Properties (62), (63) and (64) are clear.

[Figure: the de la Vallée Poussin kernel with $n = 11$.]

This kernel is useful because $V_n$ is a polynomial of degree $2n + 1$ with $\hat V_n(j) = 1$ for $|j| \leq n + 1$, so it may be used to construct approximations to a function $f$ by trigonometric polynomials having the same Fourier coefficients as $f$ for small frequencies.

5. Pointwise convergence 

Recall that a sequence of elements $(x_n)$ in a normed space $(X, \|\cdot\|)$ converges to $x$ if $\|x_n - x\| \to 0$ as $n \to \infty$. If the space $X$ is a space of complex-valued functions on some set $Z$ (for example, $L_1(\mathbb{T})$, $C(\mathbb{T})$), then there is another notion of convergence: $x_n$ converges to $x$ pointwise if for every $z \in Z$, $x_n(z) \to x(z)$ as a sequence of complex numbers. The question addressed in this section is the following: does the Fourier series of a function converge pointwise to the original function?

In the last section, we showed that for $L_1$ functions on the circle, $\sigma_n(f)$ converges to $f$ with respect to the norm of any homogeneous Banach algebra containing $f$. Applying this to the Banach algebra of continuous functions with the sup norm, we have that $\sigma_n(f) \to f$ uniformly for all $f \in C(\mathbb{T})$.

If the function $f$ is not continuous on $\mathbb{T}$, then the convergence in norm of $\sigma_n(f)$ does not tell us anything about the pointwise convergence. In addition, if $\sigma_n(f, t)$ converges for some $t$, there is no real reason for the limit to be $f(t)$.

Theorem 6.8. Let $f$ be a function in $L_1(\mathbb{T})$.

(a) If
\[ \lim_{h \to 0} \left(f(t+h) + f(t-h)\right) \]
exists (the possibility that the limit is $\pm\infty$ is allowed), then
\[ \sigma_n(f,t) \to \tfrac12 \lim_{h \to 0} \left(f(t+h) + f(t-h)\right). \]

(b) If $f$ is continuous at $t$, then $\sigma_n(f,t) \to f(t)$.

(c) If there is a closed interval $I \subset \mathbb{T}$ on which $f$ is continuous, then $\sigma_n(f,\cdot)$
converges uniformly to $f$ on $I$.
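Parts (a) and (b) can be illustrated on a concrete function. The example below is an assumption made for illustration and does not appear in the notes: the sawtooth $f(t) = t$ on $[0, 2\pi)$, extended periodically, for which $\hat f(0) = \pi$ and $\hat f(j) = i/j$ for $j \ne 0$. It is continuous at $t = 1$ and jumps at $t = 0$, where $\frac12(f(0+) + f(0-)) = \pi$.

```python
import numpy as np

def fejer_mean_sawtooth(n, t):
    """sigma_n(f, t) for the sawtooth f(t) = t on [0, 2*pi), extended periodically.

    Uses fhat(0) = pi and fhat(j) = i/j for j != 0, so that the terms for
    j and -j combine into -2 * sin(j t) / j.
    """
    j = np.arange(1, n + 1)
    weights = 1.0 - j / (n + 1)  # Fejer weights 1 - |j|/(n+1)
    return np.pi - 2.0 * np.sum(weights * np.sin(j * t) / j)

# At the continuity point t = 1 the means should approach f(1) = 1.
at_continuity_point = [fejer_mean_sawtooth(n, 1.0) for n in (10, 100, 2000)]

# At the jump t = 0 the means should approach the midpoint value pi.
at_jump = fejer_mean_sawtooth(2000, 0.0)
```

At the jump every sine term vanishes, so $\sigma_n(f, 0) = \pi$ for all $n$, which is the midpoint value predicted by (a) rather than either one-sided limit.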

Corollary 6.3. If $f$ is continuous at $t$, and if the Fourier series of $f$ converges
at $t$, then it must converge to $f(t)$.
Proof. Recall equation (72):

\[ \sigma_n(f) = \frac{1}{n+1}\left(S_0(f) + S_1(f) + \dots + S_n(f)\right). \]

By assumption and (b), $\sigma_n(f,t) \to f(t)$ and $S_n(f,t) \to S(t)$ say. Write the right-hand
side, evaluated at $t$, as

\[ \frac{1}{n+1}\left(S_0(t) + S_1(t) + \dots + S_{\sqrt n}(t)\right) + \frac{1}{n+1}\left(S_{\sqrt n + 1}(t) + \dots + S_n(t)\right). \]

The first term converges to zero as $n \to \infty$ (since the convergent sequence $(S_n(t))$
is bounded, and only about $\sqrt n$ terms are divided by $n+1$). For the second term,
choose and fix $\epsilon > 0$ and choose $n$ so large that

\[ |S_k(t) - S(t)| < \epsilon \]

for all $k > \sqrt n$. Then the whole second term is within $\frac{n - \sqrt n}{n+1}\epsilon$ of
$\frac{n - \sqrt n}{n+1} S(t)$, and $\frac{n - \sqrt n}{n+1} \to 1$. It follows that

\[ \frac{1}{n+1}\left(S_0(t) + S_1(t) + \dots + S_n(t)\right) \to S(t) \]

as $n \to \infty$, so $S(t)$ must coincide with $\lim_{n \to \infty} \sigma_n(f,t) = f(t)$. □

Turning to the proof of Theorem 6.8, recall that the Fejér kernel $(K_n)$ (see
Lemma 6.5) is a positive summability kernel with the following properties:

\[ \lim_{n \to \infty} \left( \sup_{\theta < t < 2\pi - \theta} K_n(t) \right) = 0 \quad \text{for any } \theta \in (0, \pi), \tag{73} \]

and

\[ K_n(t) = K_n(-t). \tag{74} \]

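Proof of Theorem 6.8. Define

\[ \check f(t) = \lim_{h \to 0} \tfrac12 \left(f(t+h) + f(t-h)\right), \]

and assume that this limit is finite (a similar argument works for the infinite cases).
We wish to show that $\sigma_n(f,t) - \check f(t)$ is small for large $n$. Evaluate the difference,

\[ \sigma_n(f,t) - \check f(t) = \frac{1}{2\pi} \int_{\mathbb{T}} K_n(\tau) \left(f(t-\tau) - \check f(t)\right) d\tau \]
\[ = \frac{1}{2\pi} \left( \int_{-\theta}^{\theta} K_n(\tau) \left(f(t-\tau) - \check f(t)\right) d\tau + \int_{\theta}^{2\pi - \theta} K_n(\tau) \left(f(t-\tau) - \check f(t)\right) d\tau \right). \]

Applying (74) this may be written

\[ \sigma_n(f,t) - \check f(t) = \frac{1}{\pi} \left( \int_0^{\theta} + \int_{\theta}^{\pi} \right) K_n(\tau) \left( \frac{f(t-\tau) + f(t+\tau)}{2} - \check f(t) \right) d\tau. \tag{75} \]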



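Fix $\epsilon > 0$, and choose $\theta \in (0, \pi)$ small enough to ensure that

\[ \tau \in (-\theta, \theta) \implies \left| \frac{f(t-\tau) + f(t+\tau)}{2} - \check f(t) \right| < \epsilon \tag{76} \]

and choose $N$ large enough to ensure that

\[ n > N \implies \sup_{\theta < \tau < 2\pi - \theta} K_n(\tau) < \epsilon. \tag{77} \]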



Putting the estimates (76) and (77) into the expression (75) gives
\[ |\sigma_n(f,t) - \check f(t)| < \epsilon + \epsilon \|f - \check f(t)\|_1 \]

which proves (a).

Part (b) follows at once from (a). 

For (c), notice$^{3}$ that $f$ must be uniformly continuous on $I$. This means that
(given $\epsilon > 0$) $\theta$ can be chosen so that (76) holds for all $t \in I$ and $N$ depends only
on $\theta$ and $\epsilon$. This means that a uniform estimate of the form

\[ |\sigma_n(f,t) - f(t)| < \epsilon + \epsilon \|f - f(t)\|_1 \]
can be found for all $t \in I$. □

6. Lebesgue's Theorem 

The Fejér condition, that

\[ \check f(t) = \lim_{h \to 0} \frac{f(t+h) + f(t-h)}{2} \tag{78} \]

exists is very strong, and is not preserved if the function $f$ is modified on a null
set. This means that property (78) is not really well defined on $L_1$. However, (78)
implies another property: there is a number $\check f(t)$ for which

\[ \lim_{h \to 0} \frac{1}{h} \int_0^h \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check f(t) \right| d\tau = 0. \tag{79} \]

This is a more robust condition, better suited to integrable functions$^{4}$.

Theorem 6.9. If $f$ has property (79) at $t$, then $\sigma_n(f,t) \to \check f(t)$. In particular
(by the footnote), for almost every value of $t$, $\sigma_n(f,t) \to f(t)$.

Corollary 6.4. If the Fourier series of $f \in L_1(\mathbb{T})$ converges on a set $F$ of
positive measure, then almost everywhere on $F$ the Fourier series must converge to
$f$. In particular, a Fourier series that converges to zero almost everywhere must
have all its coefficients equal to zero.

Remark 6.1. The case of trigonometric series is different: a basic counterexample
in the theory of trigonometric series is that there are non-zero trigonometric
series that converge to zero almost everywhere. On the other hand, a trigonometric
series that converges to zero everywhere must have all coefficients zero$^{5}$.



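Proof of Theorem 6.9. Recall the expression (75) in the proof of Theorem 6.8,

\[ \sigma_n(f,t) - \check f(t) = \frac{1}{\pi} \left( \int_0^{\theta} + \int_{\theta}^{\pi} \right) K_n(\tau) \left( \frac{f(t-\tau) + f(t+\tau)}{2} - \check f(t) \right) d\tau. \tag{80} \]

Also, by (69),

\[ K_n(\tau) = \frac{1}{n+1} \left( \frac{\sin\frac{n+1}{2}\tau}{\sin\frac12\tau} \right)^2, \tag{81} \]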

3 A continuous function on a closed bounded interval is uniformly continuous.

4 There are functions $f$ with the property that Fejér's condition (78) does not hold anywhere,
but (79) does hold for any $f \in L_1(\mathbb{T})$, for almost all $t$ with $\check f(t) = f(t)$. This is described in
volume 1 of Trigonometric Series, A. Zygmund, Cambridge University Press, Cambridge (1959).

5 See Chapter 5 of Ensembles parfaits et séries trigonométriques, J.-P. Kahane and R. Salem,
Hermann (1963).
and $\sin\frac{\tau}{2} > \frac{\tau}{\pi}$ for $0 < \tau < \pi$, so

\[ K_n(\tau) \le \min\left\{ n + 1, \frac{\pi^2}{(n+1)\tau^2} \right\}. \tag{82} \]

It follows that the second integral in (80) will converge to zero so long as $\left((n+1)\theta^2\right)^{-1}$
does. Pick $\theta = n^{-1/4}$; this guarantees that as $n \to \infty$ the second integral tends to
zero.
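The estimate (82) and the choice $\theta = n^{-1/4}$ can be sanity-checked numerically. This is only a sketch on a finite grid, assuming the closed form of the Fejér kernel $K_n(\tau) = \frac{1}{n+1}\left(\sin\frac{(n+1)\tau}{2} / \sin\frac{\tau}{2}\right)^2$; the grid sizes and sample values of $n$ are arbitrary.

```python
import numpy as np

def fejer_kernel(n, tau):
    """Assumed closed form of K_n(tau), for tau not a multiple of 2*pi."""
    return (np.sin((n + 1) * tau / 2) / np.sin(tau / 2)) ** 2 / (n + 1)

def bound_82(n, tau):
    """Right-hand side of (82): min(n + 1, pi^2 / ((n + 1) tau^2))."""
    return np.minimum(n + 1, np.pi ** 2 / ((n + 1) * tau ** 2))

# Largest amount by which K_n exceeds the bound on a grid in (0, pi].
tau = np.linspace(1e-4, np.pi, 5000)
worst_excess = max(
    np.max(fejer_kernel(n, tau) - bound_82(n, tau)) for n in (1, 10, 100, 1000)
)

# With theta = n^(-1/4), the kernel is uniformly small on [theta, pi].
n = 10_000
theta = n ** -0.25
sup_beyond_theta = np.max(fejer_kernel(n, np.linspace(theta, np.pi, 200_000)))
```

On $[\theta, \pi]$ the bound (82) gives $K_n(\tau) \le \pi^2 / ((n+1)\theta^2)$, which for $\theta = n^{-1/4}$ is of order $n^{-1/2}$.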

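Now consider the first integral. Write

\[ \Psi(h) = \int_0^h \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check f(t) \right| d\tau. \]

Then

\[ \frac{1}{\pi} \left| \int_0^{\theta} K_n(\tau) \left( \frac{f(t+\tau) + f(t-\tau)}{2} - \check f(t) \right) d\tau \right| \]

is bounded above by

\[ \frac{1}{\pi} \left| \int_0^{1/n} \right| + \frac{1}{\pi} \left| \int_{1/n}^{\theta} \right| \le \frac{n+1}{\pi} \Psi\left(\frac{1}{n}\right) + \frac{\pi}{n+1} \int_{1/n}^{\theta} \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check f(t) \right| \frac{d\tau}{\tau^2} \]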

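(we have used the estimate for $K_n$ from (82)). By the assumption (79), the first
term $\frac{n+1}{\pi}\Psi\left(\frac{1}{n}\right)$ tends to zero. Applying integration by parts to the second term gives

\[ \frac{\pi}{n+1} \int_{1/n}^{\theta} \left| \frac{f(t+\tau) + f(t-\tau)}{2} - \check f(t) \right| \frac{d\tau}{\tau^2} = \frac{\pi}{n+1} \left[ \frac{\Psi(\tau)}{\tau^2} \right]_{1/n}^{\theta} + \frac{2\pi}{n+1} \int_{1/n}^{\theta} \frac{\Psi(\tau)}{\tau^3} \, d\tau. \tag{83} \]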



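For given $\epsilon > 0$ and $n > n(\epsilon)$, (79) gives

\[ \Psi(\tau) < \epsilon\tau \quad \text{for } \tau \in (0, \theta = n^{-1/4}). \]
It follows that (83) is bounded above by

\[ \frac{\pi\epsilon n}{n+1} + \frac{2\pi\epsilon}{n+1} \int_{1/n}^{\theta} \frac{d\tau}{\tau^2} < 3\pi\epsilon, \]

which completes the proof. □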



APPENDIX A 



1. Zorn's lemma and Hamel bases 



Definition A.1. A partially ordered set or poset is a non-empty set $S$ together
with a relation $\le$ that satisfies the following conditions:

(i) $x \le x$ for all $x \in S$;

(ii) if $x \le y$ and $y \le z$ then $x \le z$ for all $x, y, z \in S$.

If in addition for any two elements $x, y$ of $S$ at least one of the relations $x \le y$
or $y \le x$ holds, then we say that $S$ is a totally ordered set.

The set of subsets of a set $X$, with $\le$ meaning inclusion, defines a partially
ordered set for example.

Definition A.2. Let $S$ be a partially ordered set, and $T$ any subset of $S$. An
element $x \in S$ is an upper bound of $T$ if $y \le x$ for all $y \in T$.

Definition A.3. Let $S$ be a partially ordered set. An element $x \in S$ is maximal
if for any $y \in S$, $x \le y \implies y \le x$.

The next result, Zorn's lemma, is one of the formulations of the Axiom of 
Choice. 

Theorem A.l. If S is a partially ordered set in which every totally ordered 
subset has an upper bound, then S has a maximal element. 

This result is used frequently to "construct" things - though whenever we use it 
all we really are able to do is assert that something must exist subject to assuming 
the Axiom of Choice. An example is the following result - as usual, trivial in finite 
dimensions. 

To see that the following theorem is "constructing" something a little surprising,
think of the following examples: $\mathbb{R}$ is a linear space over $\mathbb{Q}$; $L_2[0,1]$ is a linear space
over $\mathbb{R}$.

Theorem A.2. Let $X$ be a linear space over any field. Then $X$ contains a
set $A$ of linearly independent elements such that the linear subspace spanned by $A$
coincides with $X$.

Any such set $A$ is called a Hamel basis for $X$. It is quite a different kind of
object to the usual spanning set or basis used, where $X$ is the closure of the span
of the basis. If the Hamel basis is $A = \{x_\lambda\}_{\lambda \in \Lambda}$, then every element of $X$ has a
(unique) representation

\[ x = \sum_{\lambda} a_\lambda x_\lambda \]

in which the sum is finite and the $a_\lambda$ are scalars.
Proof. Let $S$ be the set of subsets of $X$ that comprise linearly independent
elements, and write $S = \{A, B, C, \dots\}$. Define a partial ordering on $S$ by $A \le B$ if
and only if $A \subset B$.

We first claim that if $\{A_\alpha\}$ is a totally ordered subset of $S$, it has the upper
bound $B = \bigcup_\alpha A_\alpha$. In order to prove this, we must show that any finite number
of elements $x_1, \dots, x_n$ of $B$ are linearly independent. Assume that $x_i \in A_{\alpha_i}$ for
$i = 1, \dots, n$. Since the set $\{A_\alpha\}$ is totally ordered, one of the subsets $A_{\alpha_j}$ contains
all the others. It follows that $\{x_1, \dots, x_n\} \subset A_{\alpha_j}$, so $x_1, \dots, x_n$ are linearly
independent.

We may therefore apply Theorem A.1 to conclude that $S$ has a maximal element
$A$. If $y \in X$ is not a finite linear combination of elements of $A$, then the set
$B = A \cup \{y\}$ belongs to $S$ (since it is linearly independent), and $A \le B$, but it is
not true that $B \le A$, contradicting the maximality of $A$.

It follows that every element of $X$ is a finite linear combination of elements of
$A$. □



2. Baire category theorem 

Most of the facts assembled here are really about metric spaces - normed spaces 
are a special case of metric spaces. 

A subset $S \subset X$ of a normed space is nowhere dense if for every point $x$ in the
closure of $S$, and for every $\epsilon > 0$, $B_\epsilon(x) \cap (X \setminus \bar S)$ is non-empty.

The diameter of $S \subset X$ is defined by

\[ \operatorname{diam}(S) = \sup_{a, b \in S} \|a - b\|. \]

Theorem A.3. Let $\{F_n\}$ be a decreasing sequence of non-empty closed sets
(this means $F_n \supset F_{n+1}$ for all $n$) in a complete normed space $X$. If the sequence
of diameters $\operatorname{diam}(F_n)$ converges to zero, then there exists exactly one point in the
intersection $\bigcap_{n=1}^{\infty} F_n$.

Proof. If $x$ and $y$ are both in the intersection, then by the definition of the
diameter, $\|x - y\| \le \operatorname{diam}(F_n) \to 0$ so $x = y$. It follows that there can be no more
than one point in the intersection.

Now choose a point $x_n \in F_n$ for each $n$. Then $\|x_n - x_m\| \le \operatorname{diam}(F_m) \to 0$
as $n > m \to \infty$, since both points lie in $F_m$. Thus the sequence $(x_n)$ is Cauchy, so has a limit $x$ say by
completeness. For any $n$, $F_n$ is a closed set that contains all the $x_m$ with $m \ge n$,
so $x \in F_n$. It follows that $x \in \bigcap_{n=1}^{\infty} F_n$. □

The next result is a version of the Baire$^{1}$ category theorem.

Theorem A.4. A complete normed space cannot be written as a countable
union of nowhere dense sets.

In the language of metric spaces, this means that a complete normed space is
of second category.



1 René Baire (1874-1932) was one of the most influential French mathematicians of the early
20th century. His interest in the general ideas of continuity was reinforced by Volterra. In 1905, 
Baire became professor of analysis at the Faculty of Science in Dijon. While there, he wrote an 
important treatise on discontinuous functions. Baire's category theorem bears his name today, as 
do two other important mathematical concepts, Baire functions and Baire classes. 






Proof. Let $X$ be a complete normed space, and suppose that $X = \bigcup_{j=1}^{\infty} X_j$
where each $X_j$ is nowhere dense (that is, the sets $\bar X_j$ all have empty interior). Fix
a ball $B_1(x_0)$. Since $\bar X_1$ does not contain $B_1(x_0)$ there must be a point $x_1 \in B_1(x_0)$
with $x_1 \notin \bar X_1$. It follows that there is a ball $B_{r_1}(x_1)$ such that $\bar B_{r_1}(x_1) \subset B_1(x_0)$
and $\bar B_{r_1}(x_1) \cap X_1 = \emptyset$. Assume without loss of generality that $r_1 < \frac12$.

Similarly, there is a point $x_2$ and a radius $r_2$ such that $\bar B_{r_2}(x_2) \subset B_{r_1}(x_1)$, and
$\bar B_{r_2}(x_2) \cap X_2 = \emptyset$, and without loss of generality $r_2 < \frac13$. Notice that $\bar B_{r_2}(x_2) \cap X_1 = \emptyset$
automatically since $B_{r_2}(x_2) \subset B_{r_1}(x_1)$.

Inductively, we construct a sequence of decreasing closed balls $\bar B_{r_n}(x_n)$ such
that $\bar B_{r_n}(x_n) \cap X_j = \emptyset$ for $1 \le j \le n$, and $r_n \to 0$ as $n \to \infty$.

Now by Theorem A.3, there must be a point $x$ in the intersection of all the
closed balls $\bar B_{r_n}(x_n)$, so $x \notin X_j$ for all $j \ge 1$. This implies that $x \notin \bigcup_{j \ge 1} X_j = X$,
a contradiction. □



